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Our  research  this  past  year  concentrated  on  probabilistic  techniques 
for  algebraic  algorithms,  such  as  the  Greatest  Common  Division  Algorithm. 
These  probabilistic  algorithms  are  developed  in  Richard  Zippel's  doctoral 
thesis,  a  revised  version  of  which  is  enclosed.  These  algorithms  are  in 
some  cases  exponentially  better  than  all  other  known  algorithms  and  are  as 
close  to  the  best  algorithms  one  can  expect  for  a  large  class  of  practical 
problems  so  as  to  essentially  cause  work  in  this  area  to  cease. 

Research  on  algebraic  alorithms  (GCD,  factorization  of  polynomials, 
resultants,  determinants,  solutions  of  equations)  have  been  going  on  for 


twenty  years.  The  first  approaches  were  relatively  straight-forward 
approaches  that,  in  the  case  of  the  GCD  algorithms,  were  variants  of 
Euclid's  algorithm.  These  are  dependent  on  divisions  of  polynomials  or 
pseudo  divisions  and  had  the  unfortunate  property  of  increasing  the  size 
of  intermediate  results  by  exponential  orders.  These  unbearable  effects 
of  the  straight-forward  approaches  were  first  combated  with  a  class  of 
algorithms  called  modular  algorithms  and  popularized  by  G.  Collins.  The 
modular  algorithms  are  optimal  when  the  polynomials  are  dense  (and  thus 
have  an  exponential  number  of  terms  in  multivariate  problems).  They, 
unfortunately,  turn  out  to  assume  that  all  problems  are  dense  and  hence  are 
exponentially  slow  in  practical  cases  (i.e.,  multivariate  and  sparse). 
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The  major  attack  on  the  modular  approaches  was  the  use  of  Hensel's 
lemma  by  Moses  and  Yun.  Hensel's  lemma  is  the  algebraic  variant  of 
Newton's  method.  This  approach,  dubbed  the  EZGCD  algorithm  in  the  case 
of  GCD,  worked  fine  in  many  cases  but  failed  when  the  derivative  became  0. 
Whereas  Newton's  method  only  slows  down  at  such  a  point,  the  use  of  Hensel's 
lemma  forced  us  to  expand  the  problem  exponentially.  This  problem,  called 
the  "bad  zero"  problem,  was  the  one  solved  by  Zippel's  probabilistic  approach. 

Zippel  shows  that  the  derivative  goes  to  zero  at  an  infinite  number  of 
points,  but  the  probability  that  a  randomly  chosen  point  is  such  a  "bad"  point 
be  made  as  low  as  one  pleases.  Zippel  also  generalizes  Hensel's  lemma  so 
that  one  can  start  it  at  any  point  (the  EZGCD  algorithm  always  started  at  0) . 
Zippel's  thesis  shows  that  the  overhead  for  his  probabilistic  GCD  algorithm 
compares  favorably  with  other  approaches  and  that  indeed  it  solves  the 
exponential  problem  of  EZGCD.  It,  of  course,  compares  very  favorably  with  the 
older  approaches. 

Since  the  completion  of  the  thesis  in  September,  1979,  much  of  the 
work  has  gone  into  implementing  and  using  the  algorithms  in  MACSYMA.  In 
practical  uses,  the  probabilistic  aspect  of  the  algorithm  has  not  ever  been 
a  problem.  The  algorithm  can  check  when  it  runs  into  difficulty  (i.e., 
when  the  randomly  chosen  point  is  indeed  a  zero  of  the  derivative)  and  can 
try  another  point,  but  it  has  not  had  to  do  so  yet. 

We  have  also  used  similar  techniques  in  other  algebraic  problems. 

For  example,  R.  Gosper  had  a  set  of  three  lO^-degree  polynomials  to  solve. 
Because  Gosper  was  only  interested  in  rational  number  solutions,  Zippel  was 


can 


able  to  use  a  Hensel  method  and  obtained  hundreds  of  rational  solutions.  In 


my  opinion,  no  other  known  approach  to  the  problem  would  have  succeeded 
on  present  day  computers. 


Probabilistic  Algorithms 
for  Sparse  Polynomials 


by 

Richard  Eliot  Zippel 


I 


Contents 


Chapter  I  Introduction  . 4 

1.  Probabilistic  Algorithms  .  . . 6 

2.  Factoring  Polynomials  and  Computing  GCDs . 6 

3.  Determinants,  Resultants  and  Linear  Equations . 8 

4.  A  Roadmap  . 9 

Chapter  II  Classical  GCD  Algorithms . 11 

1.  Euclid’s  Algorithm . 12 

2.  Subresultant  PRS  . 14 

3.  The  Modular  GCD  algorithm . 15 

1.  Modular  Arithmetic . 15 

2.  The  Chinese  Remainder  Algorithm . 16 

3.  Detailed  Description  of  the  GCD  Algorithm . 19 

Chapter  III  Sparse  Modular  Algorithm  . 23 

1.  Overview  of  the  Sparse  Modular  Algorithm  . 24 

2.  General  Formulation  of  Sparse  Modular  Algorithm  . 26 

3.  An  Example  . 27 

4.  Analysis  . 30 

Chapter  IV  Some  Applications  . 34 

1.  A  Resultant  Calculation . 35 

2.  Recovering  Rational  Numbers  and  Rational  Functions  . 37 

3.  Determinants . 38 

Chapter  V  Idcal-adic  Arithmetic . 40 

1.  Newton’s  Iteration . 40 

2.  p-adic  Numbers  . 41 

2 


3.  m-adic  Arithmetic  . 

Chapter  VI  Hensel’s  Lemma  . 

1.  A  New  Formulation  of  Hcnscl’s  Lemma  .  . 

1.  Reducing  Problems  to  Solving  Equations 

2.  Solving  the  Systems  of  Equations  .  .  . 

2.  Proof  of  Hcnsel’s  Lemma . 

3.  Zassenhaus’  Formulation  of  Henscl’s  Lemma 

Chapter  VII  The  Sparse  Ilcnsel  Algorithm  . 

1.  Introduction  of  Sparse  Techniques  . 

2.  Some  Ideas  for  Implementations  . 

3.  Analysis  of  the  Sparse  Ilcnsel  Algorithm  .  . 

4.  Timings . 

Appendix  I  Polynomials  Used  in  the  Timing  Tests  .  . 


References 


Chapter  I 


Introduction 


Scientists  and  engineers  have  used  algebraic  manipulation  systems  with  significant  success  in 
many  computational  problems.  There  are  numerous  symbolic  computations  in  general  relativity, 
high  energy  physics  and  celestial  mechanics  that  have  been  successfully  completed  using  algebraic 
manipulation  systems;  yet,  they  would  have  taken  years  to  do  by  hand. 

Unfortunately,  algebraic  systems  do  have  their  drawbacks.  Because  they  deal  largely  with 
exact  quantities  and  not  approximations  like  floating  point  numbers,  the  expressions  that  are 
dealt  with  become  larger  with  almost  every  arithmetic  operation.  Consider  the  following  sum: 
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.8225. 


Wilh  the  terms  given,  only  the  first  two  digits  will  give  a  meaningful  approximations  to 
Using  floating  point  arithmetic,  we  might  compute  with  four  digits  This  would  give  .8180.  But 
if  the  sum  is  computed  with  exact  rational  numbers  we  would  have  • 

This  illustrates  the  fundamental  tradeoff  between  precision  and  the  size  of  the  expressions 
used  in  a  computation.  Because  algebraic  manipulation  systems  insist  on  exactness,  they  are 
always  faced  with  large  expression  growth,  some  of  which  may  be  unnecessary.  The  unnecessary 
growth  can  often  be  fatal.  For  a  number  of  problems,  of  which  greatest  common  divisor 
calculation  is  the  best  known,  the  straightforward  approach  leads  to  intermediate  expressions 
that  arc  much  larger  than  the  final  answer.  Behavior  of  this  sort  is  called  intermediate  expression 
swell. 

The  intermediate  expression  swell  problem  is  most  prominent  when  the  problem  to  be  solved 
involves  a  large  number  of  variables.  Assume  a  polynomial,  P,  involves  v  variables  and  each 
appears  wilh  degree  at  most  d.  We  will  call  P  dense  if  nearly  all  possible  monomials  arc  present; 
that  is,  P  contains  almost  (d  -{-  I)'  terms.  If  only  a  small  portion  of  the  number  of  possible  terms 
is  present  then  P  is  said  to  be  sparse.  Assume  the  expressions  with  which  a  computation  begins 
arc  sparse.  After  a  number  of  arithmetic  operations  the  intermediate  expressions  will  become 
more,  dense.  This  is  known  as  fill  in.  As  the  steps  of  the  computation  proceed,  the  operations 
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with  the  intermediate  expressions  will  become  more  expensive,  if  the  final  answer  is  relatively 
small,  possibly  because  it  is  sparse  for  some  deep  reason,  then  the  intermediate  computations 
can  require  exponential  time  in  the  size  of  the  answer. 

Over  the  past  fifteen  years,  two  fundamental  approaches  have  been  advanced  to  help  contain 
intermediate  expression  swell  in  certain  cases.  In  the  late  sixties,  Brown  and  Collins  and  his 
students  investigated  algorithms  based  on  modular  arithmetic  and  modular  homomorphisms 
(Bro71j.  The  resulting  algorithms  did  not  use  large  amounts  of  storage,  but  for  multivariate 
problems  with  small,  sparse  answers  the  modular  algorithms  could  require  exceedingly  large 
amounts  of  time. 

Zasscnhaus  [Zas69],  Wang  and  Rothschild  (Wan75,  Wan78],  Musser  [Mus75],  Moses  and 
Yun  |Mos73,  Yun74,  Yu»76]  have  investigated  an  alternate  approach  that  makes  use  of  a 
computational  formulation  of  Hcnsci’s  lemma.  This  approach  not  only  minimizes  the  storage 
required  in  the  intermediate  computations,  but  also  runs  in  polynomial  time  in  the  size  of  the 
answer  for  most  problems.  Unfortunately  there  exist  problems  that  require  both  exponential 
time  and  space  when  this  method  is  used.  Thus  in  the  worst  case,  both  the  modular  and 
Henscl-based  algorithms  require  exponential  time  to  compute  the  answer. 

The  main  accomplishment  of  this  thesis  is  to  introduce  a  new  probabilistic  idea  that  allows 
us  to  modify  both  the  modular  and  llenscl  approaches  in  a  manner  that  greatly  improves  the 
behavior  of  these  algorithms.  These  modified  algorithms  use  an  internal  randomization  process, 
so  the  amount  of  time  they  require  on  any  particular  problem  may  vary  from  trial  to  trial. 
However,  their  average  running  time  for  any  particular  problem  is  polynomial.  We  say  that 
these  algorithms  run  in  probabilistic  polynomial  time. 

The  two  classical  approaches  to  problems  in  algebraic  manipulation  utilize  some  of  the 
structure  inherent  in  multivariate  polynomial  problems.  By  replacing  some  of  the  variables  by 
integers,  the  multivariate  problem  is  reduced  to  a  univariate  problem  that  is  much  easier  to  solve. 
It  is  then  observed  that  the  univariate  solution  is  very  closely  related  to  the  original  multivariate 
solution.  For  many  problems  the  univariate  solution  is  the  image  of  the  multivariate  solution 
with  the  variables  replaced.  This  univariate  solution  is  then  lifted  to  a  bivariate  solution,  and 
then  to  a  trivariatc  solution  and  so  on.  When  all  the  variables  have  been  recovered  we  should 
be  left  with  the  desired  multivariate  solution. 

This  version  of  the  modular  algorithm  docs  not  take  advantage  of  any  information  about 
the  “shape”  of  the  polynomials  that  might  be  determined  at  the  stages  of  the  lifting  process.  The 
llenscl  algorithm  is  generally  implemented  in  a  manner  that  goes  from  a  univariate  solution  to 
the  full  multivariate  solution  in  one  step.  Though  this  is  very  efficient  in  many  cases,  there  are 
classes  of  problems  for  which  the  basic  technique  breaks  down.  (Basically  because  the  Jacobian 
of  a  system  of  equations  closely  related  the  problem  becomes  zero.  This  is  called  the  bad  zero 
problem.)  In  this  ca.se  the  problem  can  be  modified  to  avoid  the  bad  zero  problem,  but  only 
by  making  the  modified  answer  much  denser  in  most  cases. 

The  fundamental  idea  contained  in  this  thesis  is  best  expressed  in  terms  of  an  interpolation 
algorithm.  Assume  that  we  are  interpolating  a  sequence  of  values  of  a  sparse  bivariate  polynomial, 
P(X,  Y).  This  is  normally  done  by  determining  a  number  univariate  polynomials,  P( X,  y,),  and 
interpolating  their  coefficients  to  compute  P(X,  Y).  Our  algorithm  computes  the  first  univariate 
polynomial  by  the  usual  interpolation  scheme,  but  then  assumes  the  other  univariate  polynomials 
have  the  same  structure.  For  instance,  if  the  first  univariate  polynomial  is  X'  -f~  1,  then  we 
assume  that  the  other  univariate  polynomials  have  only  an  X  1  and  constant  term  and  that 
the  coefficients  of  the  X1,  X  \  X1  and  X  terms  arc  zero.  If  this  assumption  is  correct  then 
we  will  only  need  to  lift  the  coefficients  of  X  ’  and  of  the  constant  term,  thus  decreasing  the 
number  of  terms  that  need  to  be  computed  by  G7%.  An  assumption  of  this  sort  is  utilized  as 
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cadi  variable  is  introduced  into  the  answer.  This  exponentially  decreases  the  number  of  terms 
that  are  computed  and  gives  us  an  algorithm  that  is  polynomial  in  the  size  of  the  answer. 


1.  Probabilistic  Algorithms 

Probabilistic  techniques  have  been  applied  to  difficult  problems  before  By  difficult  we  mean 
problems  whose  best  known  algorithm  requires  exponential  time.  The  NJMiard  problems  [Car79] 
certainly  fall  into  this  class,  as  do  the  problems  of  polynomial  factorization  and  computation 
of  greatest  common  divisors.  Weidc  [Wei78]  divided  probabilistic  algorithms  into  three  classes 
in  his  thesis.  In  the  first  class  we  have  those  algorithms  that  give  an  approximate  answer  to 
all  problems  (usually  in  polynomial  time).  The  graph  algorithms  of  Graham  [Gra66]  and  Karp 
[Kar76|  fall  into  this  class.  The  primality  testing  algorithms  of  Solovay  and  Strasscn  [Sol77] 
and  Rabin  [Rab76]  give  an  exact  answer  most  of  the  time.  Algorithms  in  this  class  can  often 
be  modified  to  always  return  an  exact  answer,  but  then  will  occasionally  require  exponential 
time.  The  final  class  of  algorithms  yields  a  good  approximate  answer  most  of  the  time. 

The  algorithms  presented  in  this  thesis,  and  those  derived  from  the  ideas  given  here  will 
fall  into  the  second  class  if  it  is  possible  to  verify  that  an  answer  is  correct  Both  the  GCD 
and  polynomial  factorization  problems  lie  in  this  class.  Some  problems,  like  the  computation  of 
determinants  and  resultants,  have  solutions  that  are  difficult  to  verify.  Our  algorithms  for  these 
problems  fall  into  a  subcategory  of  Wcidc’s  third  class.  These  algorithms  return  the  correct 
answer  most  of  the  time,  but  there  is  a  chance  the  answer  will  be  incorrect. 

The  algorithms  presented  in  this  thesis  arc  probably  best  viewed  as  algorithms  that  produce 
candidate  solutions  which  have  a  probability  t  of  being  incorrect.  It  is  possible  to  compute  the 
candidate  solutions  in  polynomial  time  both  in  terms  e  and  the  size  of  the  candidate.  It  is  possible 
to  make  t  as  small  as  desired.  In  practice,  c  can  be  small  enough  that  only  one  candidate  need 
be  computed. 

2.  Factoring  Polynomials  and  Computing  GCDs 

Factoring  integers  has  been  a  favorite  pastime  of  human  and  electronic  computers  for  some 
time  With  the  advent  of  symbolic  algebra  systems  in  the  sixties,  it  was  not  surprising  that 
interest  would  arise  in  factoring  polynomials.  The  ability  to  factor  polynomials  has  proven  to 
be  an  exceedingly  useful  tool  in  other  problems.  Of  course,  this  ability  can  make  solving  a 
polynomial  equation  much  easier  and  thus  lies  at  the  heart  of  most  packages  to  solve  equations 
and  systems  of  equations.  Factoring  polynomials  also  is  part  of  most  algorithms  for  computing 
integrals  of  rational  and  algebraic  functions. 

Algebraic  number  theory  computations  abound  with  factoring  problems.  The  determination 
of  the  structure  of  Galois  groups,  the  computation  of  the  degree  of  an  algebraic  number  over  a 
field  and  the  investigations  of  class  groups  arc  just  a  few  of  the  problems  that  require  factoring 
of  polynomials. 

Many  users  of  algebraic  manipulation  systems  have  discovered  that  factoring  programs 
can  be  among  their  most  powerful  simplifiers.  The  factorization  of  a  result  will  often  yield  a 
more  succinct  expression  and  more  insight  into  its  physical  interpretation  than  the  expanded 
representation  that  many  algorithms  return. 

The  first  algorithm  for  determining  the  factors  of  a  polynomial  over  the  integers  was  invented 
by  von  Schubert  in  1 79. 'f  and  rediscovered  by  Kronecker  in  the  late  nineteenth  century.  This 
algorithm  is  now  probably  best  known  for  its  highly  exponential  behavior.  The  truly  practical 
algorithms  that  have  been  developed  in  the  seventies  arc  based  on  Bcrlekamp’s  algorithm  [lierCO] 


for  factoring  univariate  polynomials  over  a  finite  field  and  Zasscnliaus’  version  of  llensel's  lemma 
jZastil)]  for  lifting  a  factorization  over  a  finite  field  to  one  over  (lie  integers  and  then  to  a 
multivariate  factorization. 

As  was  mentioned  earlier,  there  are  certain  classes  of  problems  for  which  the  flenscl  approach 
takes  an  extremely  long  time.  These  problems  are  generically  called  “bad  zero”  problems. 
Characteristically  these  problems  require  time  that  is  exponential  in  the  number  of  variables. 

Recently,  Paul  Wang  jWan78]  introduced  some  ideas  on  how  to  reduce  the  impact  of  the 
bad  zero  problems  on  algorithms  using  llensel’s  lemma.  The  heuristics  he  uses  yield  a  very 
significant  reduction  of  the  exponential  cost  of  some  bad  zero  type  problems,  but  they  do 
not  completely  eliminate  the  exponential  behavior.  The  precise  dependence  of  the  exponential 
behavior  of  Wang’s  algorithms  has  not  yet  been  determined,  due  in  part  to  the  fact  that  they 
are  exceedingly  complex  algorithms. 

The  techniques  presented  in  this  thesis  are  a  direct  result  of  Wang's  ideas  when  applied  to  our 
formulation  of  llensel’s  lemma.  A  significant  advantage  of  onr  framework  for  the  Ilenscl  lemma 
is  that  Wang’s  ideas,  when  incorporated,  yield  a  demonstrably  polynomial  time  algorithm. 
Actually,  the  work  for  this  thesis  began  bv  trying  to  show  that  Wang’s  ideas  could  be  applied 
to  the  formulation  of  llensel’s  lemma  presented  here. 


Though  probably  not  as  exciting  a  problem  as  factoring  polynomials,  computing  GCD’s 
is  certainly  at  least  as  important.  GCD’s  are  implicitly  used  in  computations  to  keep  rational 
functions  reduced  to  lowest  terms.  Unlike  the  situation  with  factoring  polynomials  where  there 
is  essentially  only  one  realistic  approach,  there  are  four  approaches  that  have  been  implemented 
and  are  used  in  algebraic  manipulation  systems.  Each  works  quite  well  lor  some  class  of  problems. 
The  siibresultant  GCI)  algorithm  [Col67]  is  rather  difficult  to  analyze,  but  il  seems  to  perform 
well  when  the  GCI)  is  a  large  factor  of  the  original  polynomials.  When  the  two  polynomials 
have  a  GCI)  of  1,  the  siibresultant  algorithm  will  often  require  unreasonably  large  amounts 
of  time  and  space.  Recently  (learn  |llea7!)j  introduced  a  new  algorithm  along  the  same  lines 
as  the  siibresultant  algorithm,  which  seems  to  behave  somewhat  Getter  than  the  siibresultant 
algorithm  but  still  exhibits  exponential  behavior.  The  modular  algorithm  [Bro71]  docs  not 
require  a  large  amount  of  space  but  it  still  runs  in  exponential  time  in  terms  of  the  number 
of  variables.  This  is  reasonable  for  completely  dense  polynomials.  The  Ilenscl  version  of  the 
GCD  algorithm  [Mos7d],  which  is  called  the  EZGCI)  algorithm,  has  the  same  problem  with 
bad  zeroes  that  plagues  the  Ilenscl  factorization  algorithms. 

Our  improvement  to  the  Ilenscl  factoring  algorithm  also  yields  a  polynomial  time  algorithm 
for  GCI)  computat  ions.  However,  we  have  also  been  able  to  apply  our  techniques  to  the  modular 
algorithm  to  produce  a  probabilistic  procedure  that  computes  polynomial  GCDs  in  time  that 
is  polynomial  in  the  size  of  the  answer.  This  new  modular  algorithm,  which  was  discovered 
in  discussions  with  Barry  Tragcr,  is  actually  significantly  simpler  than  the  Ilenscl  algorithm 
and  illustrates  the  key  ideas  in  our  approach  more  dearly.  Thus  we  discuss  the  sparse  modular 
algorithm  lirst,  even  though  our  version  of  the  Ilenscl  algorithm  was  discovered  first. 

The  following  table  lists  the  times  required  by  a  number  of  algorithms  to  compute  the  GCD 
of  tfic  second  set  of  polynomials  listed  in  the  appendix.  It  is  intended  (o  give  an  indication  of  the 
possible  performance  of  the  algorit  hms  presented  in  this  thesis.  In  section  VII. \  we  discuss  this 
example  more  fully.  The  times  are  listed  in  seconds  and  the  asterisks  indicate  that  MACSYMA 
ran  out  of  storage. 
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3.  Determinants,  Resultants  and  Linear  Equations 

The  computation  of  determinants  of  large  sparse  matrices  is  one  of  the  more  difficult 
problems  in  computing.  Using  numerical  methods,  “large”  means  matrices  with  thousands  of 
rows  and  columns.  In  algebraic  manipulation  a  matrix  with  10  to  20  rows  is  quite  large.  This 
is  one  of  the  problems  that  quite  graphically  points  out  the  difference  between  numerical  and 
symbolic  calculations.  The  determinant  of  a  general  20  X  20  symbolic  matrix  contains  20! 

2.5  X  101M  terms  far  more  than  can  be  handled  by  any  existing  computer  or  even  any  being 
designed  now.  On  the  other  hand,  the  determinant  of  any  numerical  matrix  is  a  single  number. 

One  result  that  points  to  the  difficulty  in  computing  determinants  has  been  recently 
discovered  by  Papadimitriou  [I’ap79].  lie  has  shown  that  for  certain  n  x  n  matrices  involving 
\/ri  variables,  the  computation  of  the  coefficient  of  a  single  monomial  in  the  determinant  is 
NP-hard. 

If  the  only  problem  being  considered  was  computing  the  entire  determinant  then  determinant 
calculations  would  be  quite  uninteresting.  On  occasion,  however,  something  is  known  about  the 
structure  of  the  determinant  beforehand.  It  would  he  very  useful  if  some  use  can  be  made  of 
this  information  to  speed  the  computation. 

Commonly,  in  the  course  of  a  problem,  a  system  of  equations  is  constructed.  Of  all  the 
variables,  only  one  or  two  may  actually  be  interesting.  All  the  others  were  merely  introduced 
to  set  up  the  system  of  equations  Using  numerical  methods,  the  computation  of  the  values  of 
these  auxiliary  variables  is  not  really  a  devastating  problem.  Though  their  computation  may 
require  additional  time,  the  space  required  is  only  linear  in  their  number. 

In  symbolic  problems  each  of  the  auxiliary  solutions  can  be  very  large.  If  there  is  any 
way  to  eliminate  the  computation  of  the  auxiliary  variables  it  will  be  very  worthwhile.  Until 
now,  the  only  technique  available  for  avoiding  these  auxiliary  computations  was  the  modular 
interpolation  algorithm.  But  this  algorithm  does  not  lake  advantage  of  any  sparsity  in  the 
answer.  In  a  later  section  of  this  thesis  we  will  discuss  the  first  algorithms  for  solving  these  sorts 
of  problems  that  do  take  advantage  of  the  sparsity  of  the  answer. 

The  computat  ion  of  polynomial  resultants  and  inversion  of  a  mat  rix  containing  polynomial 
entries  can  both  be  considered  to  be  special  cases  of  determinant  calculations.  They  also  seem 
to  have  the  same  problems  that  plague  determinants.  It  is  very  difficult  to  check  an  answer  with 
all  of  these  problems.  Thus  the.  algorithms  we  present  can  possibly  give  an  erroneous  result,  but 
the  probability  that,  an  answer  is  erroneous  can  be  set  arbitrarily  low  before  the.  computation 
begins. 


8 


¥ 


4.  A  Roadmap 


This  thesis  is  divided  into  two  essentially  independent  pieces.  Part  I  is  devoted  to  the 
study  of  the  modular  algorithm,  its  derivatives  and  applications.  Part  II  is  concerned  with  the 
algorithms  based  on  Henscl’s  lemma.  In  each  portion  the  core  of  the  results  is  contained  in 
chapters  that  discuss  the  use  of  sparsity  in  each  of  these  approaches. 

The  main  example  that  is  used  to  illustrate  both  the  modular  and  Ilensel  algorithms  is  the 
computation  of  the  GCD  of  two  polynomials.  This  algorithm  is  used  frequently  in  algebraic 
manipulation  and  is  implicit  in  virtually  all  computations  involving  rational  functions.  In 
fact,  during  the  sixties  before  Collins’  and  Brown’s  |nro71]  work  on  GCD  computations,  the 
dominant  cost  in  rational  function  operations  was  the  GCD  calculations  used  to  ensure  that 
rational  functions  were  reduced  to  lowest  terms. 

Because  of  their  importance  we  begin  Part  I  with  a  review  of  the  what  arc  now  the  “classic” 
GCD  algorithms.  This  review,  which  is  contained  in  chapter  II,  begins  with  a  discussion  of 
the  principles  underlying  Euclid’s  algorithm  for  polynomials  and  the  various  improvements 
of  it,  which  culminated  in  Collins’  subrcsiiltant  algorithm  [Col7lj.  The  chapter  concludes  its 
survey  of  the  algorithms  of  the  sixties  with  Collins’  and  Brown’s  version  of  the  modular  GCD 
algorithm. 

In  chapter  III  wc  present  our  sparse  version  of  the  modular  algorithm.  This  chapter  presents 
the  fundamental  idea  advanced  in  this  thesis.  In  chapter  III  we  also  give  a  relatively  detailed 
analysis  of  the  behavior  of  the  probabilistic  sparse  modular  algorithm. 

In  chapter  IV  we  apply  the  ideas  of  the  previous  chapters  to  a  number  of  problems.  Of 
particular  interest  is  the  sparse  resultant  algorithm.  This  is  the  first  algorithm  of  any  form  that 
computes  the  resultant  of  two  polynomials  in  time  that  is  polynomial  in  the  size  of  the  answer. 
We  also  indicate  how  the  sparse  ideas  may  be  applied  to  more  involved  and  complex  problems 
in  algebraic  manipulation. 

In  the  second  portion  of  this  thesis  we  illustrate  how  the  basic  idea  of  a  sparse  lifting  can 
be  applied  to  Ilenscl’s  lemma.  Before  this  can  be  done  we  need  to  cast  Henscl’s  lemma  in  a 
framework  that  is  somewhat  difTercnt  from  that  presented  by  other  researchers  in  algebraic 
manipulation.  To  fix  the  notation  and  make  this  a  bit  more  self-contained,  the  basic  ideas  that 
are  taken  from  mathematics  are  presented  in  chapter  V.  Those  who  arc  familiar  with  the  results 
of  valuation  theory  can  skip  this  chapter. 

Chapter  VI  gives  a  detailed  discussion  of  IlcnsePs  lemma.  Wc  present  our  “new”  framework 
and  trace  its  development  from  Newton's  work  through  Henscl’s.  Wc  then  present  the  “old” 
version  that  was  popularized  by  Zassenhaus  [Zas69],  Wang  and  Rothschild  |Wan7fi],  Musscr 
[Mus75]  and  Moses  and  Yun  [Mos73,  Yun75,  Yun  76],  This  version  is  shown  to  be  equivalent  to 
our  version.  The  results  of  chapters  V  and  VI  are  quite  old  and  wc  make  no  claim  to  originality 
here. 

Chapter  VII  shows  how  the  ideas  on  sparsity  that  were  applied  to  the  modular  algorithm 
in  chapter  III  can  be  applied  to  Mcnscl’s  lemma.  Wc  also  show  how  a  trick  due  to  Wang  may 
be  used  to  increase  the  speed  of  the  algorithm  even  further.  The  final  section  of  this  chapter 
gives  an  analysis  of  the  sparse  Ilensel  algorithm.  The  Ilensel  algorithm,  when  applied  to  the 
computation  of  polynomial  GCDs,  can  be  analyzed  in  much  the  same  manner  as  the  sparse 
modular  algorithm. 

Finally,  chapter  VIII  presents  our  conclusions  and  some  points  that  deserve  further  study. 

Throughout  this  thesis  the  following  conventions  are  observed  with  respect  to  cross  refer¬ 
ences.  Any  reference  to  a  section  in  different  chapter  will  be  of  the  form  V.2.3.  The  other 
chapter’s  number  is  given  as  a  capital  Roman  numeral  and  is  followed  by  the  section  number 


and  then  the  subsection  number.  References  to  sections  in  the  same  chapter  omit  the  chapter 
number.  Bibliographical  references  consist  of  the  first  three  letters  of  the  first  author’s  last  name 
and  the  last  two  digits  of  the  year  of  publication. 


Chapter  II 


Classical  GCD  Algorithms 


All  practical  algorithms  for  computing  the  greatest  common  divisor  of  two  polynomials  are 
based  on  Euclid’s  algorithm  (perhaps  the  first  usage  of  Fermat’s  principle  of  infinite  descent 
|IIar68]).  Assume  we  arc  given  two  positive  integers  pi  >  p±  whose  greatest  common  divisor 
(GCD)  we  wish  to  calculate.  We  can  determine  two  integers,  q\  and  pj,  that  satisfy 

Pi  =  q\Pi  +  PJ  P2  >  P3  >  0 

by  integer  division.  It  is  clear  that  any  integer  that  divides pi  and  p>  also  divides  pi.  Furthermore, 
any  integer  that  divides  pi  and  pi  must  divide  p\  so  the  GCD  of  pi  and  pj  is  also  the  GCD  of 
Pi  and  p^.  Assume  p,j  7^  0.  We  now  compute  a  new  quotient  and  remainder 


Pi  =  (fiP.\  +Pa  Pj  >  P4  >  0. 

This  can  be  continued  to  give  a  remainder  sequence 

Pi  >  Pi  >  Pi  >  ■ ' '  >  Pn  >  Pn+l  =  0, 
thus  p„  must  be  the  GCD  of  pi  and  pi. 

In  this  chapter  we  will  examine  how  this  algorithm  may  be  extended  to  compute  the  GCD 
of  two  polynomials.  The  remainder  sequence  or  Euclid’s  algorithm  then  becomes  a  sequence 
of  polynomials  called  a  polynomial  remainder  sequence,  or  PRS.  Again  the  GCD  will  be  the 
last  non-zero  term  in  the  sequence.  For  polynomials,  the  term  before  the  GCD,  p„— 1,  may  be 
much  larger  than  necessary.  The  main  purpose  of  the  algorithms  in  the  first  two  sections  of 
this  chapter  is  to  reduce  the  size  of  the  elements  of  the  PRS. 

In  the  final  section,  we  discuss  the  modular  GCD  algorithm  due  to  Brown  and  Collins 
(Bro7l).  This  algorithm  is  the  first  of  the  GCD  algorithms  that  deviates  radically  from  the 
classical  approach. 
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1.  Euclid's  Algorithm 
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We  will  use  the  following  conventions  and  terminology  throughout  this  thesis.  Capital 
letters  will  be  used  to  represent  symbolic  literals.  Lower  case  letters  will  be  used  to  represent 
unknowns,  integers  and,  on  occasion,  other  expressions.  Let 

F(X)  =  f0X"  +  fiXn-'  +■••+/„ 

be  a  polynomial  in  X.  The  degree  of  F  is  n.  This  is  denoted  by  dcgF  =  n.  If  J[>  =  1  then  F  is  a 
mon/c  polynomial.  The  greatest  common  divisor  of  /),.  ..  is  the  content  of  F.  If  the  content 
of  a  polynomial  is  I,  then  it  is  said  to  be  primitive.  Given  the  polynomial  F  we  can  compute 
its  content  by  computing  the  GCD  of  fa  and  f,  call  it  gw  then  computing  the  GCD  of  gt  and 
fi  and  repeating.  The  content  will  then  be  gn.  The  primitive  part  of  F  is  f(X)/g„,  which  is  a 
primitive  polynomial. 

Lei  F\{X)  and  F2(X)  be  polynomials  over  a  field.  F|(X)  can  be  divided  by  F2(X)  to  obtain 
a  quotient  <7i(X)  and  a  remainder  Fj(X)  that  satisfy 

F, (X)  =  q,(X)-  F2(X)  +  Fj(X)  deg(/j)  <  deg(/2).  (1) 

Just  as  with  integers,  the  GCD  of  F2(X)  and  F3(X)  must  also  be  the  GCD  of  F|(X)  and  F2(X). 
This  process  can  be  repeated  with  F2(X)  and  F)(X)  to  obtain  a  quotient  <?2(X)  and  a  remainder 
Fa(X). 

F2(X)  =  92(X)  f3{X)+f4{X) 


Fn-j(X)  =  qn—j(X)  •  F„_2(X)  +  F„_)(X) 

Fu_2(X)  =  <7„_2(X)  •  F„_,(X)  +  Fn(X) 

The  degrees  of  the  F,(X)  are  decreasing,  hence,  Tor  some  i,  the  degree  of  F,(X)  must  be  zero. 
If  F,(X)  is  non-zero  then  the  next  remainder  in  the  sequence,  F,.|  i(X),  is  zero.  Without  loss 
of  generality  we  may  assume  that  F„.y  |(X)  =  0.  The  GCD  of  F|(X)  and  F2(X)  will  be  F„(X). 
Notice  that  the  degree  of  the  polynomial  is  being  used  in  place  of  the  magnitude  metric  used 
for  integers. 

As  an  example  consider  calculating  the  greatest  common  divisor  of 

F|  (X)  =  X8  +  X°  -  3X 4  -  3X3  +  8X2  -f  2 X  -  5  and 

F2(X)  =  3X°  -f  5X1  -  4XZ  -  9X  +  21. 

(This  is  the  traditional  example  used  to  illustrate  polynomial  GCD  algorithms  first  used  by 
Knuth  [Kiiu69].)  Using  the  Euclidean  algorithm  we  obtain  the  following  sequence  of  polynomials. 
We  write  only  the  coefficients  of  F,(X). 

F,=  1,0,  1,0, -3, -3, 8, 2, -5 
F2  =  3, 0, 5, 0,  — 4,  -9,  21 

F.i  =  —  >  0,  ,  0,  —  ^ 

w.-w 

'  ft  - 
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Since  the  last  non-zero  term  in  the  polynomial  remainder  sequence  is  a  constant,  the  GCD  of 
these  two  polynomials  is  1. 

Intermediate  expression  swell  usually  becomes  more  severe  as  the  number  of  variables 
increases.  In  the  following  PRS  we  have  used  bivariate  polynomials  of  smaller  degree  than  the 
previous  example,  yet  the  polynomials  that  result  are  still  quite  large. 

X4  +  X3  -  W 
X3  +  2X2  -f  3 WX  —  W  -f  1 

(-3W  +  2)X2  +  (4 W  -  1)X  -  2W  +  1 

27U-3— 2W*— IHV+3  v  I  W2-MW'-H 

91V2- I21V-H  A  '  9H2— I2IV-M 

7  — 73811  ® — 17  III  1+725\t  '  t-SIIV'— I02H  2+08M’—  8 
Oil  10811  r>‘»0ir  1 -f-‘2(Xill  '  +  I00H  2— OGIV-f  9 

Had  we  used  three,  four  or  more  variables  the  elements  of  the  PUS  would  have  filled  many 
pages. 

In  practice  it  is  not  reasonable  to  use  rational  coefficients  in  the  PRS.  The  number  of 
GCD’s  required  to  keep  these  coefficients  reduced  to  lowest  terms  is  just  too  great,  and  not 
reducing  them  leads  to  horrendous  expression  growth.  If  we  relax  (1)  slightly  we  can  obtain 
a  “pseudo-quotient"  and  “pseudo-remainder”  that  will  always  have  integral  coefficients.  Let 
F\(X)  and  F2(X)  be  primitive  polynomials  over  a  ring  R  of  degrees  n i  and  n2  respectively. 
(A  polynomial  is  primitive  if  its  coefficients  are  pairwise  relatively  prime.)  Denote  the  leading 
coefficient  of  F,( X)  by  /,.  The  pseudo-remainder  r(X)  and  pseudo-quotient  q( X)  satisfy 

/?-•»+' F,(X)  =  Fl(X)  •  q(X)  +  r(X).  deg(r)  <  ^ 


Fi  = 
F2  = 

f3  = 
f4  = 

F--,  = 


—72911 


72! 


Using  r(X)  as  Fj(X),  the  next  term  in  the  PRS  leads  to  rather  severe  expression  swell  as  we 

SC6* 

Fi  —  1,0, 1,0, -3, -3, 8, 2, -5 

F2  —  3, 0,5, 0,-4, -9, 21 

F3=  -15, 0,3, 0,-9 

F4  =  15795,30375,-59535 

F%  =  1254542875143750,-1654008338437500 

Fi  =  12593338795500743100931141992187500 

This  growth  is  clearly  unacceptable.  Most  of  these  polynomials  are  not  primitive.  If  we 
remove  their  content  at  each  step  we  get  the  following  PRS: 

F,=  1,0, 1,0,  —3,  — 3,8, — 5 
F2=  3, 0,5, 0,-4, -9,  21 

F3=  -5, 0,1, 0,-3 

F„=  13,25,-49 

F5  =  4663,-6150 

F0=  1 


This  is  known  as  the  primitive  PRS  and  the  algorithm  that  uses  it  is  called  the  primitive  GCD 
algorithm.  It  is  as  good  as  can  be  done  with  respect  to  the  growth  of  the  terms  in  the  PRS; 
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unfortunately  the  amount  of  time  required  to  ensure  that  each  term  in  the  PRS  is  primitive  is 
exceedingly  great. 

The  terms  of  a  PRS  always  satisfy  the  following  relationship 

'F.(A')  =  Fl+i(X)  ■  q(X)  +  ftFI+2( X). 

For  the  Euclidean  PRS,  /3,  =  1.  We  will  denote  the  pseudo-remainder  of  F,(x)  and  F,  +  |(x) 
by  prem(F,,  F,_|_i)  and  the  content  of  F,  by  cont(F,).  Then  the  primitive  PRS  uses  /?,  = 
cont(prcm(Fi,Fi-|-i)). 

To  avoid  the  GCD’s  needed  by  the  primitive  PRS,  Collins  introduced  the  reduced  PRS 
[Col67,  Bro71].  This  algorithm  uses  1  for  /3\  and  for  /?,.  For  the  same  problem,  the 

coefficients  of  the  reduced  PRS  are 

F,  =  1,0, 1,0, -3, -3, 8, 2, -5 
F2=  3, 0,5, 0,-4, -9,  21 

Fj  =  -15, 0,3,0, -9 

F4  =  585, 1125,-2205 

F5  =  -1885150,24907500 

F0  =  527933700 

Through  better  than  the  Euclidean  PRS,  the  coefficients  of  the  reduced  PRS  can  still  grow 
exponentially.  In  particular,  if  the  two  polynomials  arc  relatively  prime,  the  cost  involved  in 
the  PItS  can  be  prohibitive.  This  is  probably  the  most  common  case  for  the  GCD’s  involved 
in  rational  function  computations. 

2.  Subresultant  PRS 

The  best  of  the  GCD  algorithms  that  use  a  full  polynomial  remainder  sequence  is  Collins’ 
subrcsultant  algorithm.  For  sufficiently  sparse  polynomials,  Brown’s  analysis  [Bro78]  indicates 
that  even  the  subresultant  algorithm  will  require  exponential  time  to  compute  the  GCD.  Again 
the  way  of  decreasing  the  growth  of  the  terms  in  the  PRS  is  to  pick  a  better  value  for  /?,.  The 
proof  that  the  /9,  chosen  actually  leads  to  a  valid  PRS  is  not  easy.  The  interested  reader  should 
examine  |Bro78],  which  is  the  definitive  reference  for  the  subresultant  algorithm. 

Let  6i  —  n,  —  n/_pj.  For  the  subrcsultant  PRS,  /?,  can  be  chosen  as  follows 

A  =  (-i)6,+1 

A  =  (-l)*-»+l/<_afc*r|.  i  =  4, . . . ,  k  + 1; 

where 

i  —  3 . k. 

The  subresultant  PRS  for  our  standard  example  is 

F\  —  1,0, 1,0, -3, -3, 8, 2, -5 
F2=  3, 0,5, 0,-4, -9,  21 

F,=  15,0, -3,0,9 

F„  =  65, 125,  -245 

F5  .=  9326,-12300 

F0  =  260708 
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If  5,  is  1  then  the  PRS  step  that  divides  F,  and  F,_ (.|  to  produce  F,_| -2  is  called  normal. 
Otherwise,  the  step  is  said  to  be  abnormal.  At  each  normal  step  of  a  PRS  the  size  of  the 
coefficients  tends  to  grow  linearly.  Abnormal  steps  lead  to  faster  growth. 

The  subresultant  PRS  algorithm  docs  an  admirable  job  of  minimizing  intermediate  expres¬ 
sion  swell  in  the  computation  of  the  polynomial  remainder  sequence.  If  we  are  interested  in  the 
GCD  of  the  two  polynomials,  only  the  last  term  of  the  PRS  will  be  of  interest.  If  the  PRS  is 
relatively  short  then  it  will  not  have  a  chance  to  grow  too  much.  In  this  case  the  GCD  of  the 
two  polynomials  will  tend  to  be  a  large  factor  of  one  of  the  two  polynomials.  However,  if  the 
GCD  of  two  polynomials  is  small  the  PRS  involved  will  tend  to  rather  long.  In  this  case  the 
swell  involved  in  the  PRS  can  be  extremely  costly. 

3.  The  Modular  GCD  algorithm 

In  this  section  we  discuss  a  GCD  algorithm  that  docs  not  try  to  reduce  intermediate 
expression  swell  by  controlling  the  content  introduced  in  the  remainders  of  the  PRS.  Instead 
this  algorithm  maps  the  problem  into  a  univariate  polynomial  ring  over  a  field  in  which  GCDs 
can  be  computed  simply,  easily  and  without  intermediate  expression  swell  After  doing  a  number 
of  GCDs  in  that  ring,  it  is  possible  to  interpolate  the  results  to  compute  the  original,  polynomial 
answer.  Thus  there  is  no  need  to  compute  the  large  terms  of  the  PRS. 

This  idea  was  developed  by  Brown  and  Collins  [Bro71],  With  some  hints,  the  modular 
algorithm  was  independently  discovered  by  Knulh  [Knu69],  This  led  to  an  interesting  method 
of  controlling  the  intermediate  expression  swell  for  GCDs,  but  at  a  cost  in  the  time  required 
by  the  algorithm. 

Brown  and  Collins  observed  that  computing  univariate  GCD’s  over  a  finite  field  led  to  no 
swell  since  the  coefficients  could  not  grow  beyond  the  size  of  the  modulus.  Thus  they  convert 
a  multivariate  GCD  problem  to  a  number  of  univariate  GCD  problems  modulo  some  prime. 
The  answers  to  these  problems  are  then  interpolated  to  produce  the  true,  multivariate  GCD 
using  the  Chinese  remainder  algorithm  or  Newtonian  or  Lagrangian  interpolation. 

This  section  is  divided  into  two  parts.  In  the  first  we  discuss  the  Chinese  remainder  algorithm 
and  some  of  its  features.  This  is  the  key  to  doing  the  interpolation.  In  the  second  we  indicate 
how  it  is  applied  to  produce  the  modular  GCD  algorithm. 

3.1.  Modular  Arithmetic 

In  the  next  section  we  will  be  using  modular  arithmetic  rather  heavily.  In  this  section  we 
will  review  some  of  the  basic  principles  involved  when  performing  computations  with  modular 
arithmetic.  We  will  write 

x  ~  y  (mod  m) 

if  1  and  y  differ  by  a  multiple  of  m,  and  say  that  1  and  y  are  equivalent  modulo  m  if  this 
is  the  case.  It  is  conventional  to  pick  an  clement  from  each  equivalence  class  of  the.  rational 
integers  modulo  mand  do  the  arithmetic  with  these  representatives,  Two  sets  of  representatives 
arc  commonly  used.  On  most  current  computers  it  is  somewhat  easier  to  perform  modular 
arilhmetic  with  representatives  from  the  set  {  0,  I,  2, . . .  ,m—  I  }.  We  call  this  the  non-negative 
representation  of  the  integers  modulo  m  (Knu69|.  The  balanced  representation  of  the  integers 
modulo  m  uses  the  set 

{  \—(m  —  0/21, . . . ,  — -1,0, 1, . . . ,  f(m  —  l)/21 } 

(The  ceiling  operations  arc  needed  because  m  may  be  even.) 
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Regardless  of  the  representation  chosen,  any  integer  k  will  be  equivalent  to  only  one  element 
of  the  set  of  representatives.  Thus  representative  r  is  called  the  residue  of  k  modulo  m.  We  will 
also  say  that  k  is  congruent  to  r  modulo  m. 

Arithmetic  with  these  equivalence  classes  modulo  m  is  called  modular  arithmetic;  m  is  called 
the  modulus.  Modular  addition,  subtraction  and  multiplication  arc  all  performed  by  combining 
the  two  operands  using  integer  arithmetic  and  then  reducing  the  result  to  the  appropriate  range, 
usually  by  a  remainder  computation.  Raising  a  number  to  a  power  can  be  performed  by  repeated 
squaring  using  modular  multiplication.  This  is  especially  effective  since  the  numbers  used  are 
all  about  the  same  size  as  the  modulus. 

If  p  is  an  integer  modulo  m  we  can  sometimes  compute  1/p  modulo  m.  Denote  l/p  modulo 
m  by  u.  Then 

pus  1  {mod  m), 

and  pu  and  1  differ  by  a  multiple  of  m.  That  is 

pu  —  1  =  mv  or  pu  —  mv  =  1. 

If  p  and  m  have  a  common  divisor  d  >  1,  then  d  would  divide  pu  —  mv  and  would  thus  have  to 
divide  1.  Since  this  is  impossible  there  cannot  be  solutions  to  pu  —  mv  =  1  and  thus  p  cannot 
possess  an  inverse  modulo  m. 

If  p  and  m  arc  relatively  prime  then  it  is  not  too  hard  to  demonstrate  that  pu  —  mv  =  1 
must  have  solutions  To  see  this,  compute  the  GCD  of  p  and  m  using  the  Euclidean  algorithm. 
For  uniformity  we  will  let  fa  and  J\  denote  m  and  p  respectively.  We  will  also  assume  that  p 
is  in  the  non-negative  representation.  Then 

J»  ~  ?i/i  +  h 
fi  =  <71/2  +  h 


fe—2  —  <7f — ifi — 1  +  ft 
ft—  1  =  9r— 1  ft  +  1 

The  final  remainder  must  be  I  since  fa  and  f\  (p  and  m)  are  relatively  prime.  It  is  not  hard  to 
show  that  if 

Oi/o  =  bifi  +  A 

then 

ai-H  —  <7i—  lO«  +  Oi—  i 
fci+l  —  9.-l&i  +<>i— 1 

Since  =  1  wc  have  u  =  Thus,  if  p  and  m  are  relatively  prime,  p  has  an  inverse 
modulo  m,  and  in  which  case  p  is  said  to  be  a  unit.  More  generally  a  unit  is  some  clement  of  a 
ring  that  has  a  multiplicative  inverse.  Furthermore,  wc  have  developed  a  technique  for  solving 
the  equation 

af  —  bg  =  1 

for  integers  a  and  6  when  /  and  g  arc  relatively  prime.  This  equation  figures  prominently  in  the 
Henscl  algorithms  discussed  later. 

3.2.  The  Chinese  Remainder  A'gorithm 

Wc  now  begin  outlining  the  Chinese  remainder  algorithm  for  integers.  Let  Pc  ,  -  ■ ,  Pit  be 
relatively  prime  rational  integers.  Denote  the  product  of  the  p,  by  P.  Assume  we  know  that 
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some  integer  w  satisfies  the  system  of  equations 


w  —  rrii  (mod  Pi)  i  —  1, . . .  ,k. 


(2) 


We  do  not  assume  that  the  p,  are  primes.  However,  if  they  are  not  pairwise  relatively  prime  then 
the  system  of  equations  may  not  have  a  solution.  Note  that  this  is  when  P  is  not  square-free, 
i.e.  when  there  is  no  integer  Q  such  that  Q 1  divides  P.  Regardless  of  whether  P  is  square-free, 
if  there  is  a  solution  it  will  not  be  unique,  if  a  multiple  of  P  is  added  to  any  solution  of  (2)  then 
the  result  will  also  be  a  solution.  If  P  is  square-free,  then  there  all  the  solutions  of  (2)  will  lie 
in  exactly  one  equivalence  class  of  the  integers  modulo  P. 

The  Chinese  remainder  algorithm  provides  a  means  of  determining  an  integer  w  which 
satisfies  these  modular  equations,  provided  P  is  square-free,  if  there  arc  only  two  equations 
then  the  appropriate  value  for  w  is  apparent,  ui  ~  m\  pi(m,;  —  mi)p|~',  where  the  inverse 
of  pi  is  computed  modulo  pj  Thus  we  can  replace  (2)  by  the  slightly  smaller  system 


to  s  mi  -f-  pi (m^  —  m,)pf'  (mod  p,p;), 
w  ~  mj  (mod  P3), 


t u  ~  rrik  (mod  pit). 


(3) 


We  can  now  repeat  this  process  with  the  first  two  equations  of  (3)  since  pip2  and  pj  are  relatively 
prime.  The  following  specification,  which  follows  the  model  of  [Knu69],  makes  this  algorithm 
precise.  We  will  use  this  format  throughout  this  thesis. 

Algorithm  C  takes  as  input  a  set  of  pairwise  relatively  prime  rational  integers  pi, . . .  ,pk  and 
a  set  of  integers  m(, . . . ,  mk.  It  returns  an  integer  m  <  p\pi  ■  ■  pk  such  that  m  =  m,  (mod  p,) 
for  l  <  i  <  k. 

Cl.  [Initialize]  Set  q  pi,  m  *—  mi- 
C2.  [Loop]  For  i  =  2, . . . ,  k  do  step  C3. 

C3.  [Determine  new  m]  Set  cf  «—  (mod  p,),  m  <—  m  -f-  [m  —  ra,)  •  q  ■  q’  (mod  <7p,)  and 
q  —  qpi. 

C4.  [End]  Return  m. 

We  are  now  going  to  extend  this  algorithm  to  an  interpolation  algorithm  for  polynomials. 
In  order  to  do  this  we  will  rephrase  this  version  of  the  Chinese  remainder  algorithm  in  terms 
of  modern  algebra.  In  general,  the  problem  we  arc  to  solve  is:  Given  that  w  satisfies 


w  =  m\  (mod  p) 
u>  -  mi  (mod  q) 

where  p  and  q  are  relatively  prime,  we  are  to  find  m3  such  that 


w  =  m,j  (mod  pq). 


To  generalize  this  algorithm  we  will  say  that  p,  q,  m  1  mk  and  w  arc  now  all  polynomials. 
The  expression  w  =  (mod  p),  just  as  in  the  integral  case,  means  that  w  —  m\  is  divisible 
by  p.  If  p  is  a  linear  polynomial,  (X  —  at ),  then  the  representation  of  the  residue  classes  arc 
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rational  numbers.  We  can  compute  the  residue  of  f[X)  modulo  (X  —  a t)  by  division: 


f(X)  =  (X-at)q(X)  +  b. 

b  must  be  of  degree  less  than  1,  so  it  must  be  a  constant.  Replacing  X  by  aj  we  see  that 
b  =  /(aj ).  So  the  residue  of  f[X)  modulo  (X  —  a|)  is  /(ai). 

More  generally,  let  F  be  a  field  and  p(X)  a  polynomial  over  F  of  degree  d.  We  write  this  as 
p(X)  G  F|Y].  The  canonical  representatives  of  the  residue  classes  modulo  p(X)  arc  the  elements 
of  F[.Y]  of  degree  less  than  d.  Arithmetic  modulo  p[X)  is  very  similar  to  arithmetic  modulo  an 
integer.  Addition  is  a  bit  simpler  since  the  sum  of  the  two  polynomials  of  degree  less  than  d  will 
have  degree  less  than  d.  Thus  no  normalization  step  is  necessary.  The  product  of  the  residue 
classes  represented  by  /(X)  and  g{X)  is  the  remainder  of  f(X) <j(X)  when  divided  by  p(X).  The 
same  repeated  squaring  algorithm  used  for  modular  arithmetic  can  be  applied  here. 

The  elements  that  are  zero  are  multiples  of  p(X).  Denote  this  set  by  /  These  elements 
form  an  additive  group.  Furthermore,  if  a  is  in  F|Y]  and  b  £  /  then  ab  £  /  These  two  facts 
mean  that  I  is  an  ideal.  This  particular  ideal  is  denoted  by  (p(Y)).  This  notation  indicates  that 
all  the  elements  of  I  are  multiples  of  p{X).  The  residue  class  ring  is  denoted  by  F[A']/(p(X)). 
Similarly,  the  integers  modulo  pare  denoted  by  Z/(p).  (The  ideal  (p)  is  sometimes  also  written 
as  pZ  for  clarity.) 

Thus  far  we  have  placed  no  restriction  on  p( X).  If  F  is  a  ring  instead  of  a  polynomial  then 
the  multiplication  algorithm  will  be  valid  if  and  only  if  p(X)  is  monic. 

If  p{X)  is  reducible  then  not  all  elements  of  F(Ar]/(p(x))  will  be  invertible,  even  if  F  is  a 
field.  To  see  this  recall  the  algorithm  used  in  section  3.1  for  computing  the  inverse  of  an  integer 
modulo  m.  That  algorithm  was  based  on  the  fact  that  we  could  solve 

ap  —  bm  =  1. 

Thus  a  (mod  m)  was  the  inverse  of  p.  In  the  polynomial  case  the  same  algorithm  holds.  Assume 
we  wish  to  compute  the  inverse  of  /( X)  modulo  p(X).  We  can  again  try  to  solve  the  equation 

a(X)f(X)  -  b(X)p(X)  =  1. 

As  before,  this  equation  can  have  solutions  if  and  only  if  f[X)  and  p[X)  are  relatively  prime. 
Assume  for  now  that  they  are  not  relatively  prime.  If  p[X)  is  irreducible  then  f[X)  is  a  multiple 
of  p(X )  and  thus  /( X)  —  0  (mod  p(X)).  Conversely,  if  every  f(X)  which  is  not  coprime  to  p(X) 
is  a  multiple  of  p(X)  then  p( X)  is  irreducible. 

Thus  if  each  residue  class  modulo  p(X)  is  to  have  an  inverse  p(  Y)  must  be  irreducible. 
Furthermore,  it  is  not  hard  to  carry  through  the  algorithm  of  section  3.1  for  polynomials.  In 
order  for  the  final  remainder  to  be  1,  it  will  be  necessary  to  use  the  Euclidean  FRS.  This  means 
that  F  must  be  a  field.  If  these  two  conditions  are  met  then  F(A']/(p(Y))  will  be  a  field.  This 
equivalent  to  saying  that  (p(.Y))  is  a  maximal  ideal. 

We  can  now  finally  return  to  the  Chinese  remainder  algorithm.  It  is  clear  that  just  changing 
the  words  “relatively  prime  integers”  to  “relatively  prime  polynomials”  in  algorithm  C  will  make 
this  algorithm  valid  Tor  the  polynomial  case.  Let  f(X)  be  the  polynomial  that  is  to  be  computed. 
In  its  most  common  usage,  the  Chinese  remainder  algorithm  uses  linear  polynomials  (X  —  pi) 
for  the  generators  of  the  ideal  and  thus  the  residues  are  just  the  values  of  f(X)  at  X  —  p{. 
Furthermore,  the  computation  q  1  (mod  p,)  used  in  step  is  now  equivalent  to  q(p,)  1  since 
/•’  is  a  field.  With  these  observations  we  are  lead  to  the  following  algorithm. 
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Algorithm  D.  Given  two  sets  of  rational  integers  {pi,  .  ...p* •}  and  {  m(l . . . ,  m*  },  returns  a 
polynomial  /(x)  such  that  f(pt)  =  m,  for  1  <  t  <  k. 

Dl.  (Initialize]  Set  J[x)  *—  m\,  q(x)  *—  (x  —  pi). 

D2.  [Loop]  For  t  =  2, ...  ,k  do  step  D3. 

D3.  [Determine  new  /]  Set  f(x)  <—  /( x)  -j-  q(pi)~ —  /(p,))  and  q(x)  (x  —  p,)q(x). 

D4.  [End]  Return  f(x). 

It  is  important  to  note  that  even  if  the  goal  polynomial  for  algorithm  D  is  very  sparse  the 
intermediate  results  can  be  completely  dense.  The  following  example  should  demonstrate  both 
this  and  the  use  of  the  algorithm.  All  computations  are  performed  in  the  field  Z/I63Z. 


Pi 

m,- 

f{*) 

-12 

70 

70 

14 

1-75 

— 62x  —  22 

24 

75 

61x2  -  2lx-  1 

33 

72 

— 28x:i  -  26x2  +  79x  +  62 

51 

55 

-53x 4  -f-  2x  *  —  20x2  -  23x  -  2 

-1 

o 

x5  +  1 

Since  q(x)  is  a  product  of  linear  polynomials,  it  is  almost  certain  to  always  be  dense. 

3.3.  Detailed  Description  of  the  GCD  Algorithm 

As  before  we  wish  to  compute  the  GCD,  G(X),  of  the  primitive  polynomials  /•’i(X)  andF^(X). 
Let  B  be  a  number  which  bounds  the  maximum  or  the  absolute  value  of  the  coefficients  of  G. 
(This  bound  may  be  computed  from  a  theorem  of  Gclfond  [GelOO],  but  in  many  implementations 
[13ro71]  the  maximum  of  the  absolute  values  of  the  coefficients  of  /•’i(A')  and  F-t(X)  is  used  instead.) 
Let  d  =  deg(G).  Pick  k  random  prime  rational  integers  Pi,  • . .  ,pc,  such  that  p\  -  Pk  >  2 B. 
Usually  p,  are  chosen  to  be  less  than  a  machine  word  in  size  for  efficiency.  By  using  the  balanced 
representation  we  can  get  negative  coefficients  also.  Now  the  coefficients  of  Ft(X)  and  Fi(X)  are 
reduced  modulo  each  of  the  p,  successively  and  the  GCD  is  computed  over  Z/p,Z.  Denote  these 
GCDs  by  G,(X).  When  computing  polynomial  remainder  sequences  over  a  field,  the  elements 
will  always  be  monic.  Therefore,  the  GCD  which  we  compute  modulo  p,  will  be  monic.  For 
simplicity  let  us  assume  that  both  F\(X)  and  Ft(X)  arc  monic.  This  restriction  is  removed  later. 

Since  both  F/(X)  and  F-t(X)  arc  assumed  to  be  monic,  their  GCD  must  also  be  monic. 
Therefore  when  reduced  modulo  p, ,  the  CCD’s  degree  will  not  be  decreased.  But  the  degree  of 
G,(X)  can  be  larger  than  the  degree  of  the  GCD.  If  d  is  differs  from  deg(G,)  then  we  say  that  p< 
was  an  unlucky  prime.  If  a  p,  turns  out  to  be  unlucky  it  is  discarded  and  a  new  one  is  chosen. 
(In  practice  a  prime  is  considered  to  be  unlucky  if  dcg(G,)  >  dcg(G;_i).) 

We  now  have  k  polynomials,  each  of  degree  d  which  arc  images  of  G  modulo  p*.  We  can 
now  apply  algorithm  C  to  the  vector  or  constant  terms  of  G, ,  and  then  to  the  linear  terms,  and 
so  on.  The  interpolated  values  are  the  coefficients  of  G  A  slightly  more  efficient  arrangement 
of  this  algorithm  is  expressed  in  Algorithm  M,  which  does  not  require  that  l'i[x)  and  l'i(x)  he 
monic. 

Algorithm  M  takes  two  primitive,  univariate  polynomials  /’|  and  Ft  over  the  integers  and  a 
bound  B  on  the  size  the  coefficients  of  their  GCD  as  inputs  and  returns  their  GCD. 


Ml.  (Initialize]  Set  c  to  the  CCD  of  the  leading  coefficients  of  F|(X)  and  F2(X).  Pick  a  prime 

m  which  docs  not  divide  c  and  set  G{X)  to  the  CCD  of  F'i(X)  and  F2(X)  computed  modulo 

m.  Set  d  *-  dcg(G”)  If  d  is  zero,  immediately  return  1  as  the  CCD. 

M2.  (New  prime]  Pick  a  prime  p  which  does  not  divide  c.  Set  //(X)  to  the  GCD  of  F\(X )  and 

F2(X)  computed  modulo  p. 

M3.  [Impose  leading  coefficient]  Set  H[X)  *—  cXd  -|-  (//(X)  —  X:i)  ■  c~'  where  all  the 
coefficient  arithmetic  is  done  modulo  p. 

M4.  [Unlucky  prime?]  If  dcg(//)  >  d  then  p  was  an  unlucky  prime.  Go  to  M2.  If  deg(//)  <  d 
then  m  was  an  unlucky  prime,  start  all  over  again.  Go  to  Ml. 

M5.  [Loop  over  coefficients]  Set  G  *—  OForO<t<fcdo  step  M6. 

M6.  (Interpolate]  Use  algorithm  C  on  the  coefficients  of  X'  in  G  and  //  with  moduli 
m  and  p.  Let  the  interpolated  result  be  w.  Set  Gt-G-j-  wXl . 

M7.  [Reset  the  world]  Set  G  *—  G,  m  *—  pm.  If  m  >  2/3  then  return  G  else  go  to  M2. 

This  algorithm  can  easily  be  extended  recursively  to  handle  multivariate  polynomials.  A 
detailed  description  is  contained  in  [Pro71],  Here  we  will  present  a  rather  cursory  overview  of 
the  algorithm.  The  main  purpose  of  this  overview  is  to  point  out  the  source  of  the  modular 
algorithm's  exponential  behavior.  After  this  presentation  we  will  demonstrate  another,  more 
inefficient  version  of  this  algorithm  that  exhibits  the  source  of  this  behavior  quite  clearly. 

Assume  that  we  are  again  trying  to  compute  the  GCD  of  the  two  primitive,  monic 
polynomials  F\  (X,  X>, . . . ,  X,,)  and  F2(X,  X2, . . . ,  X, ).  We  will  assume  that  the  coefficients  of  Fi 
and  Fi  lie  in  a  field  (If  the  coefficients  are  rational  integers,  then  they  can  be  reduced  modulo 
some  large  prime  and  the  integer  coefficients  restored  by  a  slight  variation  of  algorithm  M  )  The 
GCD  of  Fi  and  F2  is  the  polynomial  G{X,  X2, . . . ,  X,.).  Assume  we  know  that  none  of  the  X, 
appears  in  G  with  degree  greater  than  d.  The  multivariate  algorithm  begins  by  picking  values 
for  X2  —  a2 o, . . . ,  X,.  =  a,.o,  and  substitutes  these  values  into  F\  and  F2.  The  GCD  of  the 
resulting  polynomials  is  readily  computed  The  value  of  the  constant  term  of  the  GCD  must 
be  the  constant  term  of  G  evaluated  at  X2  =  a2  o,  ...,X,.  =  a,,  o  and  similarly  for  the  other 
coefficients.  The  algorithm  now  picks  d  new  values  for  X^,  a2i, . . .  ,a2,t  and  computes  the  GCD 
of  F\  and  F2  with  X2  =  021, ... ,  02.;  and  X)  =  a.m, . . . ,  X,,  =  a.,,o.  That  is  X2  is  the  only  variable 
whose  value  changes  The  coefficients  of  the  d- j-  I  univariate  polynomials  may  be  interpolated 
using  algorithm  D  to  compute  the  bivariate  GCD  of  F\  and  F2  at  Xj  =  ajo, . . .  ,XV  =  a,,o, 
G(X,  X2,a3o, .  •  ■ ,  a„o). 

Now  a  new  value  is  chosen  for  Xj,  an ;  d  -|- 1  values  are  chosen  for  X2,  and  the  interpolation 
procedure  of  the  previous  paragraph  is  repeated.  This  yields  G(X,  X2,  an , ... ,  a,.<i).  Using  d  —  l 
bivariate  polynomials,  G(X,  X2,  ■ , . . . ,  a,.<>).  Now  the  coefficients  of  these  polynomials  can  be 
interpolated  using  algorithm  D  to  compute  G(X,  X2,  X;j , . . . ,  am).  The  computation  of  this 
polynomial  required  d  -)-  1  bivariate  polynomials  be  computed.  Each  bivariate  polynomials 
required  d  -f-  1  univariate  GCDs  so  thus  far  we  have  used  (d  -f-  1)*  univariate  GCDs.  To 
compute  the  four  variable  GCD  we  will  need  d  -f  I  trivariate  GCDs  and  will  thus  require 
(d  I )■*  univariate  GCDs,  and  the  whole  problem  will  require  (d  -f  I )'  1  GCDs.  This  is 

illustrated  by  figure  1  for  v  =  4. 

Each  level  of  the  tree  has  a  branching  factor  of  d  I  so  one  can’t  help  but  have  exponential 
behavior  In  chapter  III  we  will  show  how  to  slash  the  whole  structure  of  the  tree  so  that  it 
does  not  have  an  exponential  number  of  leaves. 

Now  we  will  present  another  way  of  looking  at  the  GCD  problem  that  yields  some  insight 
into  the  necessity  for  the  exponential  growth.  Basically  we  will  demonstrate  how  the  modular 


Figure  1. 


algorithm  can  be  shown  to  be  determining  a  number  of  coefficients.  Since  there  are  an  exponential 
number  of  these  unknown  coefficients  this  algorithm  must  be  exponential.  This  version  also 
leads  more  naturally  to  our  new  “sparse”  modular  algorithm. 

Again  F\(Z,  Z2,  ...  ,  Z,,)  and  F2(Z,  Zit . . . ,  Z,.)  are  primitive,  irionic  polynomials  and  we  want 
to  compute  their  GCD,  G{Z,Zit  .  ,Z„).  In  place  of  the  hound  B  that  we  had  for  the  size  of  the 
coefficients  in  the  univariate  case,  we  need  a  bound  on  the  degree  ofZj,...,  Zv  in  (I(Z,  Zlt . .  .  ,ZV). 
Assume  that  no  Z,  appears  with  degree  greater  than  d  in  G.  Given  this  we  can  write 

d  d 

G(Z,Z2,...,ZV)=  £•••  ^cir..ivZi‘Z^--Ztj 

i,=0  iv=0 


and  we  wish  determine  the 

A  very  simple  way  to  determine  the  coefficients  of  G  is  to  set  up  a  system  of  linear  equations 
for  which  the  unknown  coefficients  will  be  a  solution.  There  arc  ( d  +  1)'  of  these  unknown 
coefficients,  c„  ti ,.  Pick  integers  for  an,.. .  ,ai„  and  compute 

G|  =  gcd(Ft(Z ,  a  12, . . . ,  aiv),F2(Z,ai2,  ■  ■  ■ ,  «!«))• 

[f  we  write 

G,(Z)  =  9i,uZd  +  +  •  •  •+  gi4 

then  c<|uating  the  coefficients  of  Z  produces  the  following  d  -)-  1  relations  among  the  c», 

d  d 

St,j  —  ^  cj,'i'<ao,2' ' ' nti,v  j  —  0,..  .,d  (4) 

»:i=0  »'„=() 

Repeating  this  process  by  computing  (d  f  l)1’  —  1  more  univariate  GCD’s,  G2,  . . , 

enough  linear  equations  arc  produced  to  determine  the  unknown  coefficients  of  G(Z,  Z2, .  ■  ■ ,  Z„). 

This  algorithm  will  use  O(c')  steps  to  compute  the  GCD  of  two  polynomials  involving 
v  variables.  If  the  GCD  is  dense  then  this  is  the  amount  of  time  you  would  expect  a  GCD 
algorithm  to  take  since  there  will  be  (d  -f-  I)1  terms  in  the  GCD,  and  thus  (d  -f  l)’’  coefficients 
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that  need  to  be  determined.  In  the  case  of  sparse  polynomials  most  of  the  coefficients  will  be 
zero.  The  modular  algorithm  will  still  set  up  the  large  system  of  equations  (4)  and  will  still  take 
exponential  time  to  solve  the  system. 

The  major  advantage  of  the  modular  algorithm  is  that  it  never  introduces  intermediate 
results  that  are  larger  than  the  answer.  But  its  running  time  is  also  independent  of  the  structure 
of  the  answer.  This  can  prove  to  be  very  unfortunate  in  the  many  common  uses  of  the  algorithm 
where  the  expressions  involved  are  very  sparse. 
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Chapter  III 


Sparse  Modular  Algorithm 


All  modular  algorithms  have  basically  the  same  form.  The  value  of  a  polynomial  is  computed 
at  a  number  of  points  and  these  values  are  interpolated  to  produce  the  original  polynomial.  We 
will  call  this  polynomial  the  goal  polynomial  of  the  algorithm.  The  goal  polynomial  is  assumed 
to  involve  v  variables.  Each  variable  appears  to  no  higher  degree  than  d  in  the  goal  polynomial. 
The  goal  polynomial  will  be  denoted  by  P(X{ , . . . ,  Xv). 

There  are  (d  +  1)’’  independent  coefficients  in  P.  If  "many”  of  them  are  zero  then  P  is  said 
to  be  sparse,  otherwise  it  is  dense.  We  will  denote  the  number  of  non-zero  coefficients  by  l.  If 
P  is  completely  dense  it  will  have  t,i  =  (d  1)”  non-zero  coefficients.  For  sparse  polynomials 
*«:(*  + 1)”. 

Any  interpolation  algorithm  that  computes  P  must  determine  (d  -f  l)1'  coefficients.  If  less 
than  t,i  points  arc  examined  there  will  be  an  infinite  number  of  possible  polynomials  that  satisfy 
the  appropriate  degree  bounds  and  that  take  on  the  correct  values  at  the  test  points  To  see 
this  notice  that  each  point  that  is  examined  yields  a  single  linear  constraint  on  the  coefficients 
of  P 


d  d 

ft)=  X]  C<I  *»“ i'  -av 

i|— 0  iv—0 

If  there  arc  fewer  linear  equations  than  coefficients  to  be  determined,  the  system  of  equations 
will  be  underdetermined. 

Just  looking  at  (d  -|~  I )*’  points  requires  time  exponential  in  the  number  of  variables.  Thus 
any  modular  interpolation  algorithm  that  can  guarantee  that  its  results  are  correct  must  require 
an  exponential  amount  of  time  to  compute  P. 

This  chapter  presents  a  probabilistic  algorithm  that  computes  P  in  polynomial  time.  Since 
this  algorithm  is  probabilistic,  there  is  some  chance  that  the  polynomial  it  returns  as  P  is  not  the 
desired  one.  However,  the  chance  of  being  in  error  can  be  made  arbitrarily  small  by  increasing 
the  size  of  the  set  from  which  the  test  points  arc  chosen. 
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1.  Overview  of  the  Sparse  Modular  Algorithm 

The  fundamental  idea  behind  the  sparse  modular  algorithm  can  be  seen  by  comparing 
figure  1  of  section  11.3.3  with  the  figure  of  this  section  As  before,  a  sequence  of  points  is  chosen, 
Xi  —  a-u i, . . . ,  A’,.  ~  a,,()  and  the  univariate  CCI)  is  computed.  This  is  done  for  d  4  1  different 
values  of  Xg.  The  resulting  univariate  CCDs  are  interpolated  using  algorithm  1)  of  chapter  II  to 
give  a  bivariate  polynomial.  As  before,  a  new  value  of  Xj  is  chosen  and  a  new  bivariate  CCD  is 
computed,  but  this  time,  instead  of  using  d  -f-  I  univariate  CCDs  we  use  some  of  the  structure 
determined  by  the  first  bivariate  GC1).  We  assume  that  the  coefficients  of  X",  that  were  non-zero 
in  the  first  CCD  will  be  the  only  non  zero  coefficients  (In  the  figure  we  have  assumed  that 
there  arc  only  two  non-zero  coefficients  in  the  answer.)  Now  a  different,  slightly  less  efficient 
interpolation  algorithm  is  used,  but  fewer  CCDs  were  necessary  to  compute  the  base  points  for 
the  interpolation.  As  this  proceeds,  the  recursive  nature  of  the  algorithm  disappears  along  with 
the  exponential  behavior. 

The  modular  algorithm  presented  in  section  11.4  solved  one  large  system  of  equations  to 
determine  the  coefficients  of  the  goal  polynomial.  The  sparse  algorithm  begins  by  choosing  a 
starting  point  for  the  interpolation,  (asm,  •  •  ,  x„o).  It  then  produces  the  sequence  of  polynomials, 

f*{X\ ,  XU) >  •  •  • ,  Xvo), 

P2  P(  A) ,  X‘h  X30,  -  -  • ,  aSy(>), 

Pv  =  P(Xl,X2,...,Xv). 

The  process  of  computing  Pi  from  P,_  1  is  called  a  stage  of  the  lifting  process. 

Pi  is  a  univariate  polynomial  in  X\ .  The  coefficient  of  X*  in  P  is  a  polynomial  fa(X-i, . . . ,  Xv). 
If  P  is  sufficiently  sparse  there  will  be  certain  powers  of  X|  that  do  not  appear  in  /'1 .  (If  f  <  d-f-  l 
then  X|  cannot  appear  in  P  to  all  d  4  l  powers.)  Assume  that  the  X*  term  is  one  of  those 
terms  that  is  not  present.  There  arc  two  possible,  reasons  why  X^  docs  not  appear  in  l\ .  [Cither 
/t.  is  identically  zero  or  fa [r-2o, .  ■  ■ ,  x,n)  is  equal  to  zero.  The  probability  that  fafom,  ■  ■  ■ ,  *„o) 
is  zero  when  fa  is  not  identically  zero  is  extremely  small  if  the  starting  point  (x^u,  . ,  x,.n)  is 

chosen  at  random.  Thus  the  probability  that  fa  is  identically  zero  is  quite  large.  The  key  idea 
in  this  algorithm  is  to  assume  that  X*  does  not  appear  in  I i.e  ,  fa  is  identically  zero.  Thus  it 
is  assumed  that  the  coefficient  of  every  monomial  involving  X'j  is  known,  and  that  it  is  zero. 
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This  information  is  used  to  construct  i\  Now  the  same  reasoning  can  be  applied  to  each 
monomial  in  X|  and  X 2  that  does  not  appear  in  P>  Since  there  are  at  most  t  terms  in  any  of 
the  P, ,  almost  all  of  the  terms  will  be  zero,  and  the  number  of  coefficients  that  will  need  to  be 
computed  is  small. 

A  few  comments  on  the  chances  of  fi,  not  being  identically  zero  are  in  order  This  probability 
depends  only  on  the  range  from  which  xyo  . .  ,  x,.n  is  chosen  For  example,  assume  v  —  2.  Then 
fk  involves  only  one  variable  and  //,  has  only  a  finite  number  of  zeroes,  r.  If  we  pick  zyo  randomly 
from  a  set  with  B  elements  the  probability  that  we  have  picked  a  zero  of  /*  is,  at  most,  r/B. 

This  argument  is  generalized  in  chapter  IV. 

For  simplicity  we  will  walk  through  the  algorithm  when  P  is  a  polynomial  in  3  variables, 

P(X,  Y,Z).  As  usual,  we  assume  that  P  is  a  sparse  polynomial  with  t  terms  (t  «  {d  +  I)*’, 
v  =  3).  (Whenever  we  say  “pick  x,  randomly”  we  will  mean  pick  an  integer  x,  r?  donily  from 
a  set  $  that  has  at  least  B  distinct  elements.1)  Fick  and  z>)  randomly  These  i\vo  numbers 
must  be  chosen  randomly  and  are  the  only  numbers  that  need  to  be  chosen  al  random  for  this 
example. 

We  now  pick  xo, . . .  ,x,t  and  examine  the  values  of  P  at  the  points  (x, ,  yo,3i).  These  may 
be  interpolated  by  the  Lagrange  interpolation  formula  to  give  a  univariate  polynomial  in  X, 
namely  P(X,  U>,A>)  (Actually,  the  Lagrange  interpolation  formula  is  probably  more  rightly 
attributed  to  Waring  (War79].)  So  far  nothing  probabilistic  has  entered  the  algorithm. 

We  now  assume  that  if  Xk'  (for  some,  k)  has  a  coefficient  of  zero  in  P(X,  pn,ai)  then  it  will 
have  a  zero  coefficient  in  P( X,  Y,Z).  Pick  a  y\,  not  necessarily  at  random.  From  P(X,  we 
know  that  a  number  of  the  coefficients  of  P(X,  y\ ,Z(\)  are  zero.  In  particular,  since  the  coefficient 
of  Xk'  is  zero  in  P[X,  !/d,3i)>  we  will  assume  that  the  coefficient  of  Xk  in  P(X,y\,zn)  is  also 
zero.  Thus  the  only  coefficients  of  P{X,y\,zo)  that  need  to  be  determined  are  the  those  that 
arc  suspected  to  be  non-zero,  i.e.  the  coefficients  of  monomials  that  appeared  in  P[X,  yp,zi>). 

There  can  be  no  more  than  t  of  these  unknown  coefficients,  since  there  are  only  t  terms  in  the 
answer.  These  coefficients  can  be  determined  by  solving  a  system  of  linear  equations.  Only  the 
values  of  P(x 0,1/1,31),  . . ,  P{xi,  y\ ,  2p)  will  be  needed  to  set  up  this  system  of  equations. 

This  procedure,  may  be  repeated  until  we  have  determined  d-\-  J  polynomials  P(X,  pp,2p), . . . ,  P(X,  y,i,Zo)- 
Pick  a  monomial  in  A'  that  appears  in  each  of  these  polynomials.  For  simplicity  we  will  assume 
that  it  is  the  linear  term.  The  coefficient  of  X  in  P(X,  T.zp)  is  a  polynomial  in  Y  of  degree  at  most 
d.  Call  this  polynomial  J[Y).  Front  the  d-j  I  polynomials  we  have  computed  we  can  determine  the 
values  of  f(Y)  at.  tjlt.  Again  using  the  usual  interpolation  methods  we  can  determine  f(Y) 

from  this  information.  Repeating  this  for  the  quadratic  and  higher  terms  of  P(X,  t/pzp)  we  can 
determine  P(X,  T,^). 

Now  that  we  have  P{X,  T,^),  it  is  only  natural  to  to  compute  P( X,  Y,Z[)  for  a  new  z\ , 
which  does  not  need  to  be  chosen  at  random.  This  can  be  done  in  a  manner  almost  identical  with 
that  used  earlier.  We  know  that  the  monomials  that  appear  in  P(X,  F,Z|)  will  have  non-zero 
coefficients  in  P(X,  Y,Z).  Just  as  in  the  stage  where  we  introduced  the  polynomials  in  Y,  we 
assume,  that  none  of  the  X'  YJ  monomials  that  appear  in  P(X,  Y,  Z)  arc  absent  from  P{X,  Y,  zn). 

There  arc  at  most  t  of  these  monomials  Picking  a  z\,  we  sec  that  there  arc  at  most  t  unknown 
coefficients  to  be  determined  in  /'(A',  F,zi).  All  the  others  arc  believed  to  be  zero.  Picking  t  new 
pairs  of  values  (xi ,  yi ),  •  •  ,(x(,y()  and  computing  /*(x,,  tjj, z\),  we  can  set  up  a  system  of  linear 
equations  in  the  unknown  coefficients.  Solving  this  system,  we  have  l\X ,  Y,z\)\  repeating  this 
procedure  we  will  finally  determine  P(X,  Y,z,f).  Now  we  can  determine  the  polynomials  in  Z 

'The  sol  f  is  usually  cliosrn  lo  lie  Ihe  sel  of  inleuers  in  the  interval  ((),// —  I],  hut  can  he  any  set  of  integers  with 
It  elements.  Throughout  this  section  lower  case  symbols  will  denote  integers  chosen  al  random  while  uppercase 
symbols  will  he  reserved  for  variables. 


that  arc  coefficients  of  each  X'Y1  monomial  by  (he  usual  interpolation  schemes. 

There  are  two  essentially  <lilFerenl  types  of  interpolation  schemes  going  on  in  this  algorithm. 
The  first  time  we  try  to  generate  a  polynomial  in  X,  it  is  not  known  what  its  structure  is  and 
thus  the  interpolation  is  performed  as  if  the  polynomial  were  dense.  That  is  we  used  d  -f- )  points 
to  determine  a  polynomial  of  degree  d.  This  we  call  a  dense  interpolation.  Now  that  we  have  this 
skeleton  from  which  to  work,  we  can  produce  additional  polynomials  in  X  by  solving  systems 
of  equations.  These  steps  are  called  sparse  interpolations.  These  coefficients  are  then  combined 
via  a  dense  interpolation  to  give  polynomials  in  }  .  The  algorithm  proceeds  in  this  manner. 
The  first  polynomial  produced  involving  a  particular  variable  is  done  via  a  dense  interpolation. 
The  structure  determined  by  the  dense  interpolation  is  then  used  to  produce  a  skeleton  for  the 
polynomial.  This  skeleton  is  used  as  the  basis  for  a  series  of  sparse  interpolations  that  are  done 
to  set  up  the  points  for  a  new  variable. 

2.  General  Formulation  of  Sparse  Modular  Algorithm 

In  this  section  we  will  present  the  precise  form  of  the  sparse  modular  algorithm  so  it  may  be 
applied  to  a  number  of  examples.  We  will  present  these  algorithms  using  the  same  syntax  that 
wc  used  in  presenting  the  Chinese  remainder  algorithm  in  chapter  II.  Making  these  algorithms 
precise  in  this  manner  will  also  aid  in  the  analysis  of  the  algorithm. 

The  first  algorithm  wc  consider  is  used  when  a  dense  lifting  is  required;  it  is  an  extension 
of  the  Chinese  remainder  algorithm  to  polynomial  rings.  We  have  a  polynomial  f[x)  and  two 
sets  of  rational  integers  {pt,.. . ,  p*-  }  and  {m  }  such  that  /(p,)  =  m,.  We  want  to 

determine  f[x).  Let  p  —  (x  —  p, )  denote  the  principal  ideal  of  Q(;r|  generated  by  x  —  p,.  Since 
f(x)  is  an  element  of  the  ring  Q|x|,  the  m,  are  (he  images  of  /(r)  under  the  map 

A  ■  QM  —  Q|x|/(z  ~  Pi)  =  Q 

So  one  way  of  expressing  the  relationship  between  the  /,  p,  and  m,  is 

/( i)  =  in,  (mod  p^)  t  =  l, . . .  ,k. 

Thus  using  precisely  the  same  arguments  as  were  presented  in  section  II. 3. 1  we  can  justify  the 
following  algorithm. 

The  sparse  modular  algorithm  needs  a  data  structure  to  indicate  which  terms  are  known  to 
be  zero.  Since  (here  arc  fewer  terms  that  arc  likely  to  have  nonzero  coefficients  than  terms  with 
zero  coefficients,  wc  will  keep  track  of  the  nonzero  terms.  A  monomial  of  the  form  X\'  -  Xeuv 
will  he  represented  by  the  n-tuplc  (ci , . . . ,  < A  skeletal  polynomial,  S,  is  understood  to  be  a 
set  v-luplcs  such  that  each  element  of  .S’  represents  nonzero  term  in  the  goal  polynomial. 

Whenever  a  skeletal  polynomial  is  produced,  we  will  waul  lo  determine  the  coefficients  that 
arc  determined.  This  will  be  done  by  solving  a  system  of  linear  equations.  To  simplify  the 
notation  a  bit  wc  will  adopt  the  following  convention.  Assume  a  skeletal  polynomial  S  contains 
t  terms.  We  will  assume  that  each  skeletal  polynomial  has  associated  with  it  l  symbols -that 
will  represent  the  coefficients  of  the  monomials  given  by  .S'.  Denote  these  symbols  by 
where  the  subscript,  t,  is  associated  wilh  the  exponent  vector  (c, |, . . . ,  c, Then  we  define 

s(«i,. . =  si«j"  •  -f  h —  -f- «,«?'••  -a;*-. 

The  sparse  modular  algorithm  can  be  specified  as  follows. 
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Algorithm  S  takes  a  set  of  variables  {  X|, . . . ,  Xr },  a  degree  bound  <1,  a  function  F[X\, . . .  ,XV) 
and  a  starting  point  (oi,. . .  ,a„)  as  arguments.  It  is  assumed  that  the  values  P  returns  are  the 
values  of  some  polynomial  of  at  most  v  variables  and  of  degree  at  most  d  in  each  variable. 
The  starting  point  is  assumed  to  be  a  good  starting  point.  The  algorithm  returns  a  polynomial 
P(X  i,...,X,.),  where  each  variable  occurs  with  degree  no  more  than  d  and  P(b] , . . .  ,bv)  = 
F(bt , . . . ,  bv)  for  all  integers  6;. 

SI.  [Initialize]  Set  S  *—  {  (0)  }  and  pn  ♦—  a q. 

52.  [Loop  over  variables]  For  i  =  1  through  v  do  S3  through  S8. 

53.  [Iterate  d  times]  For  j  —  1  through  d  do  Sd  through  S7. 

54.  (Initial  linear  c(|iiations)  I’ick  rJt  set  L  to  (he  empty  list,  set  t  to  the  length  of  S. 

55.  [Iterate  t  times]  For  k  —  1  through  t  do  S6. 

56.  [Set  up  linear  equations]  Pick  an  (f  —  l)-tuple  Aa,  and  add  the  the  linear  equation 
S{ Aa)  =  F(\k,rJ>aJ  (_i,...,a„)  to  L. 

57.  [Solve]  Solve  the  system  of  equations  L  and  merge  the  solution  with  S  to  produce 
a  polynomial  Pj{X\, ....  X,  -i). 

S8.  [Introduce  X,]  For  each  monomial  in  S  pass  the  corresponding  coefficients  from 
po, . . . ,  p,t  and  a,;,  rq , . . . ,  rj  to  algorithm  D.  This  will  produce  i  polynomials  in  X,  that 
can  be  merged  with  S.  Set  pu  to  this  new  polynomial  and  S  to  its  skeletal  polynomial. 

S9.  [Done]  Return  po. 

There  is  one  point  at  which  caution  should  be  exercised  in  implementing  this  procedure. 
The  first  time  through  the  i  loop  the  linear  equations  that  are  set  up  will  be  trivial  since  there 
is  only  one  unknown.  There  is  a  chance  that  the  linear  equations  that  are  developed  will  not 
be  independent.  If  this  happens  then  it  is  necessary  to  execute  step  S(>  until  sufficiently  many 
independent  equations  arc  produced 

Since  wc  must  solve  a  system  of  linear  equations  in  this  algorithm,  it  is  usually  best  to 
perform  the  entire  computation  over  some  finite  field  as  we  mentioned  in  the  remarks  about 
algorithm  D. 

3.  An  Example 

To  illustrate  the  sparse  modular  algorithm  wc  will  compute  the  linear  term  of  the  GCD, 
over  the  integers,  of  the  two  polynomials 

F\(X,  Y,Z)  =  Xr'+(Y2  +  YZ  +  1)X'  +  (2K3  -  7 YZ2  ~  2)X3 

+(2T'5  4-  2T'Z  —  7 Y'Z'1  +  2 Y3  —  7 Y2Z3  -  7 YZ2  4-  3)X2 
4-(-'iy:)  +  3F2  4-  IAYZ2  +  3 YZ  4-  3)X  —  6, 

Fi(X,  Y,Z)  -  X(i-F(2F:I  -  7YZ1  4  Y  —  6 Z*)X*  4-  6X3 
4-(2F1  -  12Y3Za  ■—  7 Y2Z 2  4-  mr'Y)X2 
4-(6F 3  -  21  YZ1  4-3 Y  -  18 Z*)X  4-  9. 

The  basic  technique  will  be  use  the  sparse  modular  algorithm  to  compute  the  GCD  over 
several  finite  fields.  These  CCDs  are  then  interpolated  to  determine  the  correct  GCD  over 
the  integers.  The  first  problem  is  to  compute  a  bound  for  the  size  of  the  coefficients  of  tfic  GCD. 
Following  lirown  |ltro7l]  we  will  assume  that  the  magnitude  of  the  coefficients  of  the  GCD 


of  two  polynomials  is  bounded  by  the  product  of  the  absolute  values  of  the  largest  coefficient 
in  each.  Thus,  in  our  example  we  assume  that  no  coefficient  in  the.  GCI)  will  lie  larger  than 
14  X  21.  Since  we  do  not  know  the  sign  of  the  coefficients,  they  must  be  determined  modulo 
some  number  larger  than  2  X  14  X  21  =  588. 

Since  the  maximum  degree  of  Y  in  F  and  G  is  5  and  3  respectively,  it  can  appear  to  degree 
no  higher  than  3  in  the  GCD.  Similarly  Z  cannot  appear  to  any  degree  higher  than  4  in  the 
GCD.  Since  both  F  and  G  arc  monic  polynomials  //  must  be  monic  also.  We  need  not  worry 
about  adjusting  the  leading  coefficient. 

In  a  real  implementation  the  primes  would  probably  be  chosen  to  just  fit  in  a  word.  For 
most  computers  we  would  then  need  only  one  prime.  To  demonstrate  how  these  algorithms 
work  in  general  we  will  use  two  smaller  primes,  19  and  31. 

Wc  will  only  try  to  compute  the  coefficient  of  the  linear  term  in  X  of  the  GCD;  the 
other  coefficients  are  computed  similarly  and  should  be  computed  in  parallel.  Thus  the  goal 
polynomial  will  involve  two  variables,  Y  and  Z.  We  will  denote  the  GCI)  by  G(X,Y,Z)  and 
the  goal  polynomial  by  P[Y ,Z).  The  starting  point  is  chosen  at  random  The  integers  involved 
might  as  well  be  less  than  the  modulus  chosen.  Starting  with  19  as  the  modulus  we  chose  Y  =  12 
and  Z  —  15  as  the  starting  point. 

In  the  notation  of  algorithm  S,  the  goal  polynomial  is  P{ X\,  X;),  X\  ~  Y ,  Xi  =  Z  and  the 
starting  point  is  a\  =  12  and  at  =  15.  With  Z  fixed  at  15  we  compute  the  GCD  of  F  and  G 
with  Y  varying  over  several  values.  Actually,  only  four  are  necessary. 


Y 

Z 

G(X,V) 

12 

15 

X3  +  3X  +  3 

0 

15 

X:,  +  3 

11 

15 

X'  +  5X  +  3 

13 

15 

X'  -  IX  +  3 

2 

15 

X 1  +  5;i  +  X2  +  8X  -  4 

5 

15 

X3  -6X  +  3 

From  the  first  polynomial  it  is  clear  that  the  degree  of  the  GCD  is  not  more  than  3.  So  the 
next-to-last  polynomial  can  be  thrown  out  since  (2,  15)  must  be  an  "unlucky  evaluation  point” 
[Bro71]. 

From  this  evidence  it  seems  likely  that  the  quadratic  term  in  If  is  zero.  If  the  modulus  were 
somewhat  larger,  so  that  the  interpolation  points  could  be  chosen  from  a  larger  set,  we  would 
have  significantly  more  confidence  in  this  result.  As  we  shall  see,  this  problem  was  chosen  so 
that  none  of  these  guesses  which  might  go  awry  actually  do.  It  also  seems  that  the  constant 
term  is  independent  or  T.  At  any  rate,  it  is  easy  to  interpolate:  it  must  be  3. 

The  only  interesting  term  is  the  linear  one.  It  is  determined  by  interpolating  the  linear 
coefficients  of  II  using  algorithm  D. 
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p, 

ml 

P{Y,) 

0 

0 

0 

11 

5 

04-  h  Y  5 

13 

—7 

-9T2  4-  Y 

5 

-6 

2Y'  +  2Y 

Algorithm  D  results  in  (7(Ar,  V,  t5)  =  X  1  -j-  (2Y !  2/)^  -f-  3.  Nothing  terribly  exciting  or 

new  has  happened  yet.  So  far  we  have  not  had  a  chance  to  use  the  sparse  modular  algorithm. 
We  now  need  to  compute  G( X ,  Y,  )  for  several  other  values  of  Z.  This  is  done  using  the  sparse 
algorithm.  We  first  pick  a  new  value  of  Z ,  Z  ~  7.  We  need  only  to  pick  2  values  for  Y  at  this 
stage  since  there  are  just  2  coefficients  of  the  linear  term  to  be  determined.  We  have  assumed 
that  the  coefficients  of  the  Y~ X  and  YnX  terms  are  zero.  Picking  a  couple  of  values  for  Y  and 
computing  the  univariate  GCDs, 


Y 

Z 

G(X,;) 

4 

7 

X3  -  9*  4-  3 

13 

7 

*;1  -  8*  4-  3 

We  know  that  the  goal  polynomial  (coefficient  of  the  linear  term  in  X)  is  of  the  form 
C|  Y 1  -f-  CiY,  so  we  get  the  linear  equations 

7d  4-  4cz  =  —9 

— 7c |  -f-  13c2  :==  — '8 

These  equations  can  be  easily  solved,  giving  c\  —  2  and  c-j  —  — 1.  Note  that  this  time  only  two 
univariate  GCl)'s  were  necessary,  while  four  were  needed  for  the  previous  interpolation.  This 
procedure  must  be  repeated  for  three  more  values  of  Z.  Each  time  a  pair  of  linear  equations 
must  be  solved.  The  results  are 


|  GCDs 

Z 

G(X,  Y,  ) 

4 

15 

X]  4-  (2T;i  4-  2)X  4-  3 

2 

7 

X3  4-  {2Y:i  -  Y)X  4-  3 

2 

13 

4-  (2 Y3  -  5 Y)X  4-  3 

2 

3 

X3  4-  (2T1  —  6K)X  -f  3 

2 

13 

*  ' 4- (2T:i  4- 8T)X  4- 3 

In  the  first  column  we  have  given  the  total  number  of  additional  univariate  GCDs  required  to 
solve  determine  the  particular  polynomial  Interpolating  these  coefficients,  we  see  that 

G(X,  Y,Z)  =  X  '  4-  (2  Y3  -  7 YZl)X  4-  3. 

Now  we  change  the  modulus.  Picking  two  values  of  Y  and  Z,  with  the  modulus  set  at  31 
we  get  the  following  univariate  GCDs. 

G.{X,  22, 19)  =  X  '  —  12*  4-  3 
G(X,  17,  21)  =  A"  4-3*  4-  3 
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Since  the  linear  term  of  //  is  believed  to  be  of  the  form  CjY3  -f-  C-\YZ 2  we  can  use  these  two 
results  to  set  up  the  following  two  linear  equations: 

1 5c  i  -f-  6c.  i  =  -12 
15c;)  —  5c.|  =  3 

These  two  equations  can  be  solved  to  give  a  =  2  and  c.j  —  — 7.  Thus 

G(X,  Y,  Z)  =  X*  +  (2 Y3  -  7YZ2)X  +  3. 

Clearly  this  must  be  the  GCD.  The  total  number  of  univariate  GCDs  which  was  needed  was 
only  14.  Had  we  used  the  old  modular  algorithm  we  would  have  required  40  univariate  GCDs. 

4.  Analysis 

The  purpose  of  this  section  is  to  show  that  the  time  to  compute  one  candidate  solution  to 
a  problem  using  the  sparse  modular  algorithm  is  polynomial  in  the  size  of  the  candidate  and  e, 
where  t  is  a  bound  on  the  probability  that  the  candidate  is  erroneous. 

Denote  the  goal  polynomial  by  P(X i, . . . ,  X,.)  and  the  starting  point  by  a  =  (ai , . . . ,  a,.).  The 
a,  are  chosen  at  random  from  a  set  of  B  possible  values.  When  c  is  chosen  to  be  very  small,  B  will 
be  quite  large.  Thus  the  a,  may  be  quite  large  and  we  must  be  somewhat  careful  when  including 
the  cost  due  to  integer  arithmetic.  When  using  classical  algorithms,  the  cost  of  multiplication 
and  division  is  quadratic,  in  the  length  of  the  integers  while  the  cost  for  addition  and  subtraction 
is  linear.  To  crudely  take  this  into  account  we  will  count  the  number  of  arithmetic  operations 
with  integers  and  multiply  that  by  log2  B. 

Recall  that  it  is  the  dense  interpolation  (using  the  Chinese  remainder  algorithm)  that 
introduces  new  variables  as  the  goal  polynomial  is  built  up  The  sequence  of  polynomials  that 
is  produced  is 

P(X,  a,,),  P(Xh  X2,  a,, ....  a,.), . . . ,  P(X, ,  X2, . . . ,  X„).  (1) 

Next  there  is  a  scries  of  sparse,  interpolations  that  form  the  basis  for  the  next  dense  interpolation. 

These  sparse  interpolations  use  a  skeletal  polynomial  as  a  guide  to  the  structure  of  the 
system  of  linear  equations  must  be  solved,  if  the  true  goal  polynomial  docs  not  conform  to 
the  structure  indicated  by  the  skeletal  polynomial  then  the  next  dense  iteration,  in  addition  to 
being  wrong,  will  almost  certainly  be  dense  and  the  world  will  come  crashing  down  afterwards. 

Thus  the  entire  algorithm  depends  upon  the  accuracy  or  the  skeletal  polynomials.  Since  the 
skeletal  polynomials  arc  extracted  from  the  structure  of  the  polynomials  in  the  sequence  (1), 
it  is  important  to  know  if  P{X|,a2,.. .  ,a,.)  has  the  wrong  number  of  terms.  Clearly  it  cannot 
have  too  many  terms.  If  it  has  too  few  terms  then  the  coefficient  of  some  X*  in  P(X|, . . . ,  X„) 
is  zero  at  a.  Let  F|  be  the  product  of  the  nonzero  coefficients  of  X*  in  P  for  k  =  1  through  d. 
If  a  is  not  a  zero  of  F|  then  the  first  skeletal  polynomial  will  be  computed  correctly. 

Similarly  if  the  coefficient  of  some  monomial  in  X\  and  X2  is  zero  at  d  the  second  skeletal 
polynomial  will  be  erroneous.  Define  F2  to  be  the  product  of  the  coefficients  of  nonzero  monomials 
in  X|  and  X2  in  the  goal  polynomial  and  define  Fj, . . .  ,F,,__|  similarly.  If  the  starting  point  is 
not  a  zero  of  any  of  the  F,,  then  none  of  the  skeletal  polynomials  will  be  erroneous. 

The  auxiliary  polynomial  for  P  is  defined  to  be  F  =  F\F2  -  ■  Fv—\.  F  is  a  polynomial 
in  X2, . . . ,  X,..  The  key  assumption  used  in  the  sparse  modular  algorithm  is  that  the  initial 
evaluation  point  is  not  a  zero  of  this  polynomial.  As  we  shall  sec,  in  the  sparse  version  of  the 
llensel  algorithm  the  determinant  of  the  Jacobian  will  play  the  part  of  the  auxiliary  polynomial. 
A  point  that  is  not  a  zero  of  F  is  called  a  good  point.  Since  all  bad  points  satisfy  F  =  0  they 
form  a  variety  of  codimension  1.  Qualitatively,  almost  all  points  in  affine  (0  —  l)-space  are 
good  The  following  lemma  makes  this  precise. 
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Lemma  1.  Let  f  6  Z[X|, ....  X,,]  and  the  degree  of  f  in  X,  be  bounded  by  D.  Let  NV[D)  be 
the  number  of  zeroes  of  f,  (x[, ,  xv)  such  that  x,  £  S  (a  set  with  B  elements  B  D).  Then 
N,  (B)  <  B'  —  (B  —  D)v. 

Proof:  There  are  at  most  D  values  of  X,  at  which  /  is  identically  zero.  So  for  any  of  the  D 
values  of  X,.  and  any  value  for  the  other  X,,  /  is  zero.  This  comes  to  DB"~l.  For  all  other 
B  —  D  values  of  X,.  we  have  a  polynomial  in  v —  1  variables.  The  polynomial  can  have  no  more 
than  NV-\(B)  zeroes  Therefore, 

NV(B)  <  DB ’  +  (B  -  D)N„_|(B). 

Let  N„  =  (B  —  D)v~[nv  then 

.  DBv~l 
-  {B-DY~ 1 

For  v  —  1,  N\  <  D,  so  we  have  ni  <  D.  Summing  the  previous  equation 


z  ^  OB' 

n,’-n,+§(^FT 


<  o'y*  < D( [A)vr±\ 

hAD-°y  v  A-1  ) 
ir=w-1) 


By  the  definition  of  n,.  we  have 


Nv  =  {B~  D)'  ~~ 1  nt,  <  (B  -  -  l). 


The  lemma  follows  immediately.  | 

This  bound  is  actually  attained  by  the  dense  polynomial 


One  would  expect  a  much  tighter  bound  to  hold  for  sparse  polynomials.  To  get  an  idea  of 
how  much  better  than  this  bound  a  typical  polynomial  can  be  consider  the  following.  Let 

/(X)  =  /(X, , . . . ,  X„)  =  c,m,  +  +  amt 

be  a  polynomial  as  in  the  lemma.  We  have,  used  m,  to  represent  a  monomial  in  the  Xy,  so  /  has 
t  terms.  Assume  the  monomials  m,  arc  fixed  for  now.  If  f{h\)  is  zero  then  we  have  a  constraint 
on  the  Ci 

cimi(di)  +  c7m2(d|)  -| - (-  d,m,(d|)  =  0. 

Without  being  clever  in  our  choice  of  points,  we  can  only  force  t  —  1  points  to  be  zeroes  or  /. 
In  fact  it  is  known  |Lan62]  that  a  curve  of  genus  greater  than  0  has  only  a  finite  number  of 
integral  points.  Thus  for  v  =  2,  in  general,  if  D  >  2  there  is  a  constant  which  bounds  NV(B). 
The  proof  of  this  statement  is  extremely  difficult.  For  higher  numbers  of  variables  much  less 
is  known. 


Assume  now  that  the  goal  polynomial  has  v  variables  and  that  no  variable  appears  to 
degree  higher  than  d.  Each  of  the  F,  is  the  product  of  at  most  t  terms  and  each  term  is  of 
degree  at  most  d.  There  are  v  —  1  of  these  polynomials,  so  the  maximum  possible  degree  of  F 
is  (v  —  1  )td.  There  are  only  v  —  1  variables  in  F.  There  are  Bv  1  points  in  the  set  f  X  ■  ■  ■  X  f- 
Applying  the  lemma  to  F,  the  probability  that  a  point  chosen  at  random  will  be  a  zero  of  F  is 


A/i,_i(B)  _  B°~l  —  (B  —  Df~l 


B' — 1  Bv~l 


vD  v(v  —  l)D2  (v-l)(t>-2  )D3 
~  B  2  IB  +  6  B  *  + 

^  dv( v  —  l)t  ^  dv*t 

B  -  ~eT’ 


ignoring  the  higher  order  terms.  So  a  worst  case  bound  for  the  probability  that  a  point  will  be 
a  zero  of  F,  and  thus  a  bad  point  and  will  lead  to  an  erroneous  candidate  solution,  is 


t  — 


dv2t 

~T- 


Taking  logarithms  we  have 

\ogB  =  )og^Pj. 

The  following  theorem  is  the  fundamental  result  of  this  thesis  It  shows  that  the  time 
required  to  compute  a  candidate  goal  polynomial  is  polynomial  in  the  d,  v,  the  size  of  the 
candidate  and  the  size  of  e. 

Theorem  1.  Assume  d  bounds  the  degree  to  which  each  variable  in  the  goal  polynomial, 
P(X i, ....  X,,),  appears.  Also  assume  0(7’)  bounds  the  number  of  integer  arithmetic  operations 
required  to  compute  B(a \ , . . . ,  a„)  given  a|, . . . ,  a,..  Let  t  be  some  number  close  to  zero.  Then 
it  is  possible  to  compute  a  candidate  polynomial  in  time 

0({T  +  t])dvt\og2(dv2tt~1)) 


where  the  probability  that  the  candidate  polynomial  is  different  from  P  is  t  and  t  is  the  number 
of  terms  in  the  candidate. 

Proof:  To  prove  this  we  will  analyze  the  time  required  by  the  sparse  modular  algorithm.  This  will 
give  a  constructive  proof  of  the  theorem.  The  analysis  is  divided  into  two  portions.  First  we  will 
determine  the  number  of  points  at  which  the  value  of  the  goal  polynomial  must  be  determined. 
Then  the  lime  required  to  determine  the  goal  polynomial  from  its  values  is  determined. 

P{X\,ai, . . . , a,.)  is  computed  by  doing  a  dense  interpolation  of  d  -)-  1  points.  To  compute 
P(X\ ,  X-i,  a. i, . . . ,  a,,)  requires  d  more  univariate  polynomials,  each  of  which  is  computed  by  a 
sparse  interpolation.  So  we  need  (d  -f-  1)  +  dt  values  of  P  to  compute  P(X i,  X^.a^, . . .  ,a„).  In 
general  we  will  need 


(d  +  l)  +  dt  +  +  dt  =  (d  +  1)  -f  dt(v  -  1)  dot 
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interpolation  points.  Each  of  these  cost  T  integer  operations.  This  gives  the  first  term  of  the 
cost. 

We  now  look  at  algorithm  D,  which  performs  the  dense  interpolation  using  the  Chinese 
remainder  algorithm.  It  was  pointed  out  in  the  discussion  of  this  algorithm  that  /  and  q  will 
almost  always  be  dense  except  possibly  at  the  final  step.  We  will  assume  a  dense  representation 
for  these  polynomials  here.  At  the  beginning  of  step  D3,  f[x)  is  a  polynomial  of  i  —  1  terms 
and  q[x)  is  a  polynomial  of  t  terms.  Using  Horner’s  rule,  computing  f(pt)  requires  t  —  1 
additions  and  t  —  1  multiplications,  or  2 t  —  2  total  operations  Computing  q(p,)  will  take  2t 
operations.  The  addition  of  the  two  polynomials  will  take  i  operations.  Adding  these  up  we 
have  2 i  —  2  -f-  2i  -f-  i  -j-  3  =  5i  -)-  1  operations  to  compute  f(x).  Updating  of  q(x)  requires  2t 
operations.  All  in  all  this  comes  to  7i  -f-  1  operations.  Summing  this  for  i  =  2, . . . ,  k,  we  get 

in  0perations 

Algorithm  S,  which  solves  linear  equations  to  obtain  the  sparse  interpolation,  is  somewhat 
more  complex  and  we  will  make  a  few  crude  assumptions  in  analyzing  it.  We  will  assume  that 
P{X\ ,  ai,  ■ . . ,  a,,)  and  all  the  intermediate  polynomials  up  through  P(.Y|,  . . . ,  X,.)  have  t  terms. 
That  is,  the  number  of  terms  does  not  decrease  when  some  of  the  variables  are  replaced  by 
integers. 

There  are  t  linear  equations  that  must  be  solved.  This  will  take  about  f 1  operations,  so 
we  will  ignore  all  costs  that  are  dominated  by  t  \  Each  monomial  contained  in  S  is  a  product 
of  t  —  1  terms,  and  each  term  is  raised  to  degree,  at  most,  d.  Evaluation  of  a  monomial  will 
thus  cost  (i  —  l)logd  operations.  There  arc  no  more  than  t  terms  in  S,  so  step  S6  will  take 
about  (t  —  l)t  logd  operations.  Step  S6  will  be  iterated  t  times  to  produce  the  each  set  of  linear 
equations.  Thus  it  will  cost  (i  —  l)t2logd  integer  operations  to  produce  the  system  of  linear 
equations. 

There  will  be  t  independent  linear  equations  to  be  solved.  Using  straight  forward  algorithms 
this  will  take  about  C[i:>  operations.  Steps  S5  through  S7  will  be  executed  d  times  for  each 
variable  so  it  will  cost 

(t  —  1  )dt2  logd  -f-  C{dt3  C[dt3 

operations  to  produce  the  polynomials  Pi ,  •  •  • ,  Pj- 

There  will  be  t  terms  in  each  of  these  polynomials,  so  algorithm  D  will  be  run  t  times.  Since 
the  time  required  by  algorithm  D  is  independent  of  t  this  cost  of  this  step  will  be  dominated 
by  the  tJ  term  above.  Since  these  steps  are  repeated  for  each  variable  we  have  about  0(dvt3) 
integer  operations  for  the  lifting  stage  and 

ofdvtT  +  dvt3 


integer  operations  all  in  all. 

The  lemma  indicates  that  the  random  integers  must  be  chosen  from  a  set  of  about  dv2tt~l 
elements.  Arithmetic  operations  with  integers  this  large  require  at  most  log2(du2te— ')  actual 
operations.  Thus  the  total  time  required  by  the  sparse  modular  algorithm  must  be  increased 
by  this  factor.  This  proves  the  theorem.  | 
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Chapter  IV 


Some  Applications 


This  chapter  considers  some  of  the  more  general  applications  of  the  sparse  modular  algorithm 
to  multivariate  computations  in  algebraic  manipulation.  One  of  the  major  problems  that  arises 
is  intermediate  expression  swell.  Intermediate  expression  swell  can  occur  in  several  manners.  In 
the  course  of  a  computation  the  determination  of  the  answer  may  require  the  computations  of 
extraneous  variables,  terms  or  factors.  The  final  step  of  the  computation  is  usually  to  remove 
the  extraneous  information.  This  is  what  happens  in  the  compulation  of  polynomial  GCDs. 
The  final  term  in  a  Pft.S  generally  has  a  large  content  that  must  be  removed  Alternatively, 
computations  may  have  a  more  subtle,  yet  still  explosive  behavior.  For  instance,  the  perturbation 
scries  expansions  in  quantum  electrodynamics  generally  involve  sums  of  very  complex  integrals. 
The  integrals  yield  combinations  of  rational  numbers,  powers  of  ir  and  values  of  the  Ricman 
zeta  function  at  odd  positive  integral  values.  Yet  in  certain  computations  [Ben77]  these  huge 
expressions  simplify  to  simple  rational  integers. 

For  multivariate  computations  it  is  usually  not  difficult  to  carry  through  the  computation 
with  a  single  variable,  replacing  the  other  variables  by  randomly  chosen  numbers.  After 
performing  a  number  of  these  simple  computations,  their  results  can  be  interpolated  to  give 
the  desired  answer.  The  dense  interpolation  procedure  could  be  used,  but  there  is  a  problem. 
Usually,  we  do  not  have  a  very  good  bound  on  the  degree  of  the  various  variables.  Using 
the  old  interpolation  scheme,  the  simplified  problem  must  be  solved  about  (d  -f-  I)1'  times. 
As  we  commented  before,  for  large  or  moderate  d  and  moderate  v,  this  can  be  horrendously 
expensive.  Since  the  basic  computation  may  be  quite  complex  and  even  dominate  the  cost  of 
the  interpolation  it  is  especially  important  to  minimize  the  number  of  points  that  need  to  be 
examined  by  the  interpolation  algorithm. 

If  the  answer  is  not  dense,  our  interpolation  method  can  be  used  to  great  advantage  since 
it  needs  only  v(d  -f  l)f  points  from  which  to  interpolate  the  result.  Since  our  algorithm  is 
probabilistic  it  is  generally  a  good  idea  to  verify  the  answer.  For  certain  particular  applications, 
like  GOD  of  polynomials,  verification  is  not  difficult.  For  others,  like  the  evaluation  of 
determinants,  there  appears  to  he  no  good  technique.  Though  this  introduces  some  uncertainty 
into  the  computation,  the  uncertainty  can  be  controlled  by  increasing  the  range  from  which 
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the  random  points  are  chosen. 

This  chapter  will  show  how  these  ideas  may  he  applied  to  a  few  somewhat,  contrived 
examples.  In  section  1,  an  example  using  resultants  is  given.  Section  2  presents  a  few  simple 
ideas  that  show  how  rational  numbers  and  rational  functions  may  be  interpolated.  These  ideas 
would  be  extremely  useful  when  solving  systems  of  equations.  Section  3  uses  this  idea  in 
computing  a  term  in  the  inverse  of  a  matrix. 


1.  A  Resultant  Calculation 


The  application  of  the  sparse  modular  algorithm  to  many  computations  in  algebraic 
manipulation,  as  outlined  above,  is  a  fairly  straightforward  application  of  algorithm  S  of  the 
section  III. 2.  In  particular  the  computation  of  detenninanls,  the  solution  of  linear  equations 
and  the  computation  of  polynomial  resultants  fall  into  this  class.  Here  we  will  present  a  detailed 
example  involving  the  computation  of  a  piece  of  a  resultant. 

If  a  and  6  arc  relatively  prime  then  one  primitive  element  of  the  field  Q [ v/a,  \/f>]  is  "p  = 
v/a-f-  y/b-  The  minimal  polynomial  for  -p  is  known  to  be  the  resultant  of  (y  — z)r  —a  and  yr  —  b 
with  respect  to  y.  We  will  denote  this  polynomial  by  /r(z ).  For  the  first  few  r  this  polynomial 
is 

fi[x)  =  z'  —  (2 a  2 b)x2  -f-  a2  —  2ab  b2, 

ft(x)  =  z9  —  (—3a  36)z‘’  +  (3a2  -f*  21ab  -f-  3 b2)z  !  -f-  (a  —  6)'\ 

/i(z)  =  z16  —  (4a  +  46)z12  +  (6a2  —  I21a6  |-  662)zs  -f  •  -f  (a  —  6)4, 
f->(x)  =  z2S  —  ( — 5 a  -f  5 b)x2"  -{-  ( 10a 2  —  G05ab  +  -f-  •  •  -f  (a  —  h)5. 

Apparently  the  first  few  terms  of  fr(x)  arc  of  the  form 


Mx) 


zr’  -r((-I)ra  +  6)zr(r 

2 


+  (  -  ’  — ‘ 1  a2  -  rqrab  -f 


02)l 


+ 


(1) 


It  would  be  very  interesting  if  we  could  produce  an  explicit  formula  for  fr(x)  but  that  seems 
to  be  difficult.  The  first  step  in  trying  to  determine  an  explicit  formula  is  to  determine  the 
structure  of  a  few  of  the  terms  of  the  minimal  polynomial  The  first  two  terms  are  pretty  easy  to 
guess  and  the  structure  of  the  third  one  is  not  too  bad.  fri  this  sect  ion  we  will  demonstrate  how  the 
coefficient  of  the  zr(r_  21  term  can  be  determined  fairly  easily  If  we  have  a  table  of  these  terms, 
it  should  not  be  difficult  to  determine  the  explicit  form  of  the  coefficient  by  empirical  induction. 

Table  1  illustrates  how  costly  this  computation  can  he  if  fr(x)  were  computed  using  resultants. 

All  times  listed  arc  in  seconds.  This  computation  was  done  with  the  aid  of  MACSYMA  (MAC78], 

It  is  dear  that  it  would  not  be  practical  to  continue  to  compute  qr  in  this  manner,  i/^o 
will  be  determined  to  demonstrate  how  the  sparse  modular  algorithm  is  used.  We  will  compute 
fm{y)  for  several  values  of  a  and  b  and  modulo  several  different  primes.  Then  the  coefficients 
of  the  x'm  terms  will  be  interpolated  to  compute  qr  modulo  the  primes.  These  arc  combined 
via  the  Chinese  remainder  theorem  to  compute  qr- 

First  we  need  to  estimate  how  big  qm  will  be  At  each  level  in  the  table  it  appears  that 
qr  <  22’1  1  So  we  would  expect  qm  to  be  less  than  21'.  Two  primes  near  2”  will  more  than  , 

suffice.  We  will  use  pi  —  3435)3738337  and  jo  - z  34  35)3738313.  The  particular  term  of  the  ! 

resultant  in  wC.  fi  we  are  interested  appears  to  h  ive  only  51  terms  so  we  will  need  only  three  j 

points  for  a  and  b  For  simplicity  we  will  choose  (a,  />)  -  -  (1,2),  (2,  3),  (3,  4)  1 
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r 

time 

getime 

<7r 

2 

.090 

0. 

1 

3 

.167 

0. 

7 

4 

.287 

0. 

31 

5 

.665 

°-  -1 

121 

6 

2.446 

.919 

456 

7 

5.397 

1.876 

1709 

8 

11.219 

3.972 

6427 

9 

23.816 

9.709 

24301 

10 

45.620 

20.179 

92368 

11 

182.768 

113.993 

352705 

Table  1. 


The  resultant  of  ( y  —  x)20  —  1  and  y 20  —  2  modulo  pi  is 

x100  —  60x280  —  8l51507690x,fiO  —  1440118365xil0  -f-  1 18  1  6  1  2  1  225x320 

+  15871655180z,o°  -  75832 10257x280  +  11 11 154697 3x2<i0  +  68251305 1 5x240 
-f  8301575652xiin  +  1 196441 8384x2,m  +  1965613129x'80  —  909 489 47 3x' 00 

-  16580105186x‘ 10  +  1 649584594  lx 12(1  +  13937306864xlo° 

-  5608 14753 lx80  +  412265 1953x(i0  +  10757822857x,n 
+■  126440806 12x20  +  1. 

Though  this  is  large  it  is  much  smaller  and  can  be  computed  significantly  faster  than  the  general 
resultant.  Computing  this  resultant  for  (a,  b)  =  (2,  3)  and  (3,  4)  we  determine  that  the  x  m)  term 
has  9905214887  and  —14549308753  as  its  coefficient  in  each  case.  This  leads  to  the  following 
system  of  linear  equations 

—8151507690  =  I2u;,  +  12^  +  22wj 
9905214887  =  22w{  -f  2  •  3u>2  +  32«b 
—14549308753  =  32io,  -f  3  ■  4to2  -f  42io, 

The  solution  to  this  system  is  (tO|,  u>i,  toj)  =  (190, — 4075754320,190).  Proceeding  similarly 
with  pi  as  the  modulus  we  get  the  following  system  of  equations 

-8151509130  =  l2u/|  +  1  ■  2u*  +  22k$ 

9905210549  =  22u>i  +  2  •  2u>i  +  3 2toj 
-14549317411  =  32u>,  +  3  4u>2  -f  42u* 

which  has  (190,  — 4075755040, 190)  as  its  solution. 

The  190’s  are  expected;  they  correspond  precisely  with  (1).  To  compute  the  middle  coefficient 
we  apply  algorithm  C  to  the  system 

w  =  —4075754320  (mod  34359738337) 
w  =  —4075755040  (mod  34359738319) 
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This  gives  —1378461)287800  for  the  middle  coefficient,  and  68923264390  for  qr. 

Each  of  the  resultant  computations  required  about  11.5  seconds.  Extrapolating  from  smaller 
computations,  had  the  resultants  been  computed  over  the  integers,  they  would  have  required 
about  142  seconds  each.  Furthermore,  when  the  systems  of  linear  equations  were  solved,  the 
coefficients  involved  did  not  grow,  as  they  would  have  had  they  been  rational  integers. 

In  general  this  seems  to  be  the  best  practice  when  in  applying  the  sparse  modular  techniques. 
The  problem  is  solved  using  the  algorithm  several  times  over  finite  fields  and  the  Chinese 
remainder  theorem  is  used  to  reconstruct  the  answer. 

The  computation  of  o  required  about  70  seconds  of  computation.  Extrapolating  from  the 
preceding  table  it  appears  that  the  computation  of  the  full  resultant  would  take  about  10  hours, 
and  would  probably  not  fit  in  most  machines.  Though  the  resultant  gives  more  information,  for 
this  problem  we  were  only  interested  in  a  small  portion  of  the  answer  and  the  extra  information 
is  not  useful  at  this  stage. 

Furthermore,  by  repeating  this  process  all  the  terms  could  have  been  determined  if  desired, 
either  one  at  a  time  or  in  parallel  (in  all  senses  of  the  word)  without  requiring  significant  quantities 
of  storage. 

2.  Recovering  Rational  Numbers  and  Rational  Functions 

Thus  far  we  have  only  been  concerned  with  using  the  interpolation  techniques  to  determine 
polynomials  from  their  values.  In  this  section  we  will  consider  some  ideas  that  allow  the  sparse 
interpolation  scheme  to  be  used  to  determine  rational  numbers  and  rational  functions  from  their 
values. 

Let  m  be  the  ratio  of  two  small  integers  x  and  y  modulo  p  and  assume  the  product  2 xy  is 
less  than  p.  Then  there  is  an  integer  q  such  that 

my  —  pq  —  x. 

Rewriting  this  we  get 

m  q  _  x 

p  y~py' 

Since  x  is  assumed  to  be  small  q/y  must  be  a  convergent  of  the  continued  fraction  expansion  of 
m/p.  From  this  it  is  easy  to  determine  x.  This  all  all  depends  on  the  following  theorem  which 
is  proved  in  (Har68]  (theorem  184). 

Theorem  2.  If  p/q  is  rational  number  approximation  to  f  which  satisfies 


the  p/q  is  a  convergent  of  the  continued  fraction  of 

Algorithm  T  is  given  two  numbers  m  and  p  and  computes  a  pair  of  integers  x  and  y  such 
that  x  =  my  (mod  p).  This  done  by  computing  the  continued  fraction  of  m/p  whose  partial 
quotients  are  denoted  by  a,  and  convcrgcnts  by  qi/pi- 

Tl.  (Initialize)  Set  q~\  *—  I,  qn  <—  0,  y_i  «—  0  and  yn  «—  1.  Set  r  «—  m  and  a  *—  p.  Finally  set 
i  *—  1. 

T2.  [Partial  quotient)  Set  a,  to  the  integer  quotient  of  r  and  s.  Then  set  r  to  «  and  a  to 
the  remainder  of  r  divided  by  a. 
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T3.  [Next  term  of  CF]  Set  g,  *-  a, q, __ t  -f  q^i  and  y*  «—  o, y,__i  -(-  Vi— 2- 
T4.  [Enough?]  If  y  2  <  2 p  then  set  t  *—  t  +  1  and  go  to  step  T2. 

T5.  [End]  Return  the  fraction  (my,_i  — _ i )/9i _ i - 

This  algorithm  allows  us  to  determine  a  rational  number  from  its  residue  modulo  a  prime. 
A  similar  algorithm  can  be  used  to  determine  a  rational  function  from  its  value  modulo  a  ideal 
(p(x)).  If  p(x)  —  x",  then  its  value  is  a  truncated  power  series.  In  this  case  the  rational  function 
that  this  algorithm  returns  is  known  as  a  Fade  approximation.  A  recent  study  of  this  idea  is 
contained  in  [McE78], 

If  there  are  several  variables  in  the  rational  function  that  we  arc  trying  to  determine,  then 
we  can  replace  all  but  one  of  the  variables  by  an  integer,  compute  the  power  series  expansion  of 
the  answer  in  terms  of  the  other  variable,  and  then  interpolate  the  coefficients.  This  technique  is 
quite  satisfactory  and  is  particularly  appropriate  to  problems  like  solving  differential  equations 
where  it  is  impossible  to  replace  all  the  variables  by  constants. 

There  is  another  interesting  idea  that  can  be  used  if  even  computing  with  power  series  in 
one  variable  is  impractical.  The  basic  idea  is  to  compute  the  rational  function  fig  by  taking 
its  values,  r\ , . . . , n -,  modulo  pi , . . . , p*  and  converting  them  to  fractions,  f\/g\}. . fk/Ok-  (It  is 
assumed  that  2/,y,  <  p,  .)  The  numerator  and  denominator  can  then  be  separately  interpolated. 
There  is  a  slight  complication.  Though  the  rational  numbers  that  are  computed  arc  the  values 
of  f/g  at  the  point  in  question,  /  will  not  be  the  value  of  /  unless  the  values  of  /  and  g  are 
relatively  prime.  As  far  as  l  can  see  there  is  no  good  way  to  ensure  this,  and  there  also  does  not 
seem  to  be  an  easy  way  to  analyze  the  probabilities  involved.  Empirically,  however  this  seems 
to  be  a  quite  reasonable  way  to  determine  /  and  g.  It  is  perhaps  best  used  as  a  heuristic.  If 
the  polynomials  that  are  produced  turn  out  to  be  too  dense,  they  are  discarded  and  new  points 
are  chosen  on  the  assumption  that  there  was  a  GCD  between  the  values  of  the  numerator  and 
denominator. 

There  is  another  problem,  though  it  is  easily  overcome.  Generally,  the  p,  are  chosen  to  be 
just  smaller  than  a  machine  word  for  efficiency  considerations.  The  values  of  /  and  g  can  be 
quite  large,  especially  in  view  of  the  possibility  of  the  values  of  /  and  g  having  a  nontrivial 
GCD,  so  2 \figi  <  Pi  may  be  false.  This  problem  can  be  avoided  by  combining  the  several  of  the 
r,  via  the  Chinese  remainder  theorem  before  applying  algorithm  T.  This  will  yield  the  value  of 
f/g  modulo  PiPj. 

3.  Determinants 

In  this  example  we  will  demonstrate  how  the  sparse  algorithm  could  be  used  to  compute 
an  element  in  the  inverse  of  a  matrix.  These  techniques  may  also  be  applied  to  compute  the 
solution  of  a  system  of  linear  equations.  The  matrix  we  will  consider  is  actually  quite  small, 
but  will  serve  to  illustrate  our  methods.  This  example  is  from  [Wan76]. 

/x2  y2  0  u2  0 

x1  y1  0  0  0 

_  I  1111 

~  x5  0  0  0  0 

x;>  y3  0  0  0 

\x  y  0  u  u 

We  will  be  trying  to  determine  the  (3,  I)  component  of  inverse  of  this  matrix. 


0 

1 

0 

23 

Z 


(7) 
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Almost  certainly,  the  elements  of  the  inverse  of  A b  will  be,  non-polynomial,  rational 
functions.  We  will  use  the  techniques  of  the  previous  section  solve  this  problem. 

The  degree  bounds  for  the  determinant  of  -4b  are  easy  to  compute.  Since  each  of  the 
variables  appears  in  precisely  one  column  in  the  determinant,  no  variable  appears  to  higher 
degree  than  it  appears  in  the  matrix.  If  we  compute  the  inverse  of  J&>  modulo  34359738337, 
with  random  integers  substituted  for  z,  y,  z,  u  and  v,  we  discover  that  the  (3, 1)  coefficient  is 
independent  of  z,  y  and  z.  The  following  table  gives  the  values  of  this  coefficient  for  various 
values  of  u  and  v  and  what  algorithm  T  yields  for  f/g. 


u 

V 

Jt>3| 

h 

9i 

m 

■1 

-2 

-2 

m 

m 

14316557640 

-5 

■fit 

3 

m 

6681060232 

-7 

36 

-2 

3 

2863311528 

-1 

12 

4 

7 

3374617158 

112 

—2 

-5 

-15461882252 

7 

-20 

The  fi/gi  can  be  easily  combined  by  the  results  already  presented  to  compute  the  exact  value 


^11>31 


— u  —  u 
u2t/ 


Again  each  clement  of  the  inverse  could  be  computed  independently  (and  in  parallel)  if  the 
value  of  the  entire  matrix  were  desired. 
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Chapter  V 


Ideal-adic  Arithmetic 


As  we  shall  see,  Hensel’s  lemma  and  the  algorithms  based  on  it  arc  clever  applications  of 
Newton’s  iteration  to  problems  over  particular  domains.  In  this  chapter  we  present  Newton’s 
method  in  its  familiar  form  for  equations  over  the  reals.  We  will  show  how  this  may  be  extended 
to  coupled  systems  of  equations  and  analyze  its  convergence  properties. 

Although  it  is  possible  to  apply  Newton’s  method  directly  to  polynomial  problems  it  haB 
proven  to  be  easier  to  apply  when  the  coefficients  of  the  polynomials  lie  in  some  p-adic  domain. 
In  section  2  we  present  a  synopsis  of  the  basic  results  about  p-adic  numbers  and  present  some 
examples. 

In  section  3  we  generalize  this  slightly,  and  consider  m-adic  domains,  where  m  is  some  ideal 
contained  in  a  commutative  noctherian  ring.  The  initial  presentation  is  kept  quite  abstract 
so  that  the  results  can  be  carried  over  to  algebraic  number  rings  and  other  rather  complex 
structures.  However,  a  number  of  examples  are  presented  which  should  help  to  clarify  the  ideas 
involved. 

Virtually  nothing  in  this  chapter  is  new  and  the  mathematically  sophisticated  reader  can 
safely  skip  it,  referring  back  to  it  for  notation  as  needed  in  the  reading  of  later  sections.  Hopefully 
we  have  presented  enough  of  the  basic  concepts  to  allow  those  with  other  backgrounds  to  follow 
the  bulk  of  these  results. 


1.  Newton's  Iteration 

Assume  we  are  trying  to  find  the  zeroes  of  the  function  f(z).  If  we  have  some  idea  were  a 
solution  of  this  equation  is,  then  it  is  possible  to  use  the  Ncwton-Jtaphson  iteration  to  refine 
our  guess  to  a  zero  of  f(z).  Often  the  guess  need  not  be  very  close  at  all  for  this  iteration  to 
converge. 

Assume  x *  is  a  real  number  close  to  a  zero  of  f(x)  which  we  call  x.  We  next  try  to  find  a 
better  approximation  i*  +  |.  Expanding  f{xj+\)  as  a  power  scries  in  (ifcf  i  —  it)  by  Taylor’s 
formula  we  have 

f(ik+ 1  )  =  f(tk)  +  (**+l  —  **)/(**)  H - 
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We  assume  that  J(xk+i)  is  close  to  zero  so  that  we  may  replace  the  left  hand  side  of  this  equation 
by  zero.  We  will  also  ignore  the  higher  order  terms  on  the  right  hand  side.  All  these  assumptions 
leave  us  with  a  simple  linear  equation  which  can  be  solved  for  xk+{. 

_  /(») 

*‘+l  ~  ~  fW 

This  is  the  familiar  form  of  the  Ncwton-Raphson  iteration.  If  the  starting  point  Xf,  is  sufficiently 
well  chosen  then  the  sequence  {  xk  }  will  converge  to  a  root  of  f(z).  This  iteration  converges 
quadratically  to  the  roots  of  f(x). 

2.  p-adic  Numbers 

The  key  problem  in  using  Newton’s  method  to  solve  a  system  of  equations  is  determining 
a  “good  point”  at  which  to  begin  the  iteration.  Once  such  a  point  has  been  found,  the  iteration 
will  quickly  converge  to  a  solution  of  the  system.  The  "goodness”  of  a  point  is  directly  related  to 
its  closeness  to  a  zero  of  the  system  of  equations,  in  the  last  section  all  arithmetic  was  performed 
with  real  numbers  Consequently,  the  closeness  of  two  numbers  was  measured  by  determining 
the  absolute  value  of  their  difference.  To  get  a  good  initial  approximation  to  a  solution  we  must 
have  some  idea  of  the  magnitude  of  the  solutions.  In  physical  problems  this  information  can 
often  be  determined  from  the  physical  constraints  on  the  problem. 

For  problems  in  algebraic  manipulation  it  is  often  easy  to  solve  the  system  of  equations 
exactly  modulo  some  prime.  At  worst  we  can  exhaustively  search  the  (finite)  set  of  all  possible 
solutions.  For  many  problems  it  is  possible  to  do  much  better  than  this.  Thus  we  would  like 
to  modify  the  concept  of  distance  somehow  so  that  a  solution  to  the  system  modulo  a  prime 
yields  a  “good"  starting  point. 

This  modification  leads  to  the  p-adic  numbers,  which  we  examine  in  this  section.  The 
basic  ideas  behind  p-adic  numbers  were  originally  developed  by  llenscl  [Hen  18]  and  applied  to 
problems  in  number  theory.  Over  the  years  the  p-adic  outlook  has  permeated  large  portions  of 
algebraic  geometry  and  number  theory  and  is  now  considered  to  be  one  of  the  cornerstones  of 
modern  mathematics. 

What  we  are  looking  for  is  a  new  measure  of  distance  that  can  be  applied  to  integers.  Since 
it  is  possible  to  compute  solutions  to  problems  modulo  some  prime,  we  would  like  to  say  that 
two  numbers  that  have  the  same  residue  for  a  given  modulus  are  close.  Thus  this  new  distance 
measure  must  indicate  that  I  and  4  and  163  are  close  for  a  modulus  of  3.  The  first  test  for 
proximity  of  two  numbers  is  that  their  residue  modulo  3  is  the  same. 

It  will  be  easier  to  continue  if  we  restrict  ourselves  to  proximity  to  zero.  This  is  not  much 
of  a  restriction  since  the  distance  between  two  integers  p  and  q,  should  also  be  the  distance 
between  0  and  p  —  q. 

So  we  now  ask  how  to  measure  the  distance  between  an  integer  p  and  0.  Assume  that  p 
is  congruent  to  0  modulo  3.  Then  p  is  divisible  by  3  and  we  can  ask  how  close  p/3  is  to  zero. 
Clearly  if  p/3  were  also  congruent  to  zero  then  p  would  be  "closer”  to  zero  than  if  p/3  0 

(mod  3).  After  all  0/3  =  0  (mod  3). 

If  we  write  the  number  163  in  the  base  3  system  we  have 

l  +  0  •  3  -f  0  34  +  0  •  3;)  +  2  •  34. 

Clearly  this  number  is  quite  close  to  1,  since  the  difference  of  163  and  1  is  divisible  by  a  large 
power  of  3. 
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We  can  now  define  the  distance  measure  as  follows.  Let  a  be  an  integer,  and  assume  3r 
divides  a  but  3r  +  '  does  not.  We  can  write  a  as  3 rp.  The  the  3-adic  valuation  of  a  is 

IMl3  =  |{3rp||  j  =  3-'. 

When  p  =  0  we  will  say  that  r  =  oo. 

We  can  extend  this  definition  to  rational  numbers  also.  All  rational  numbers  can  be  written 
in  the  form  3 rp/g  where  3  does  not  divide  p  or  q,  and  r  can  be  positive,  negative  or  zero.  Then 

P'p/fllla  =  3-L 

This  definition  can  be  extended  in  the  obvious  manner  to  other  primes.  In  general  these  distance 
measures  are  called  p-adic  valuations. 

In  elementary  analysis  and  topology,  absolute  values  are  used  to  define  convergent  sequences 
and  the  convergent  sequences  are  used  to  “complete”  the  rational  numbers  to  yield  the  reals. 
We  will  now  use  the  p-adic  valuation  to  complete  the  integers  to  give  the  p-adic  integers,  Zp. 
The  basic  idea  is  quite  simple.  Let 

,  A  =  {  0(1,  0| ,  02,  .  .  •  ,  Ofc,  . . . } 

be  a  sequence  of  rational  integers  such  that 

l|Q«  -  ajWp  =  P~  ,ni"(‘,i). 

Thusoi— a,_!  =  a,p’.  The  we  can  view  {  0o,0|, . . . ,  a*  }  to  be  a  sequence  of  increasingly  better 
approximations  of  afc.  The  whole  set  is  thus  a  sequence  of  increasingly  good  approximations  to 

lim  Qk- 
k—*o o 

This  limit  is  often  represented  by  the  infinite  series 

«o  +  0|  ’  P  +  <*2  '  P2  H - •  (8) 

The  a,  are  chosen  from  the  range  0  <  a,<  p.  This  is  called  the  power  scries  representation  for 
a  p-adic  integer. 

It  is  clear  that  the  positive  integers  have  a  unique  representation  of  this  form.  The  p-adic 
integers  also  include  quantities  for  which  (8)  is  an  infinite  series.  Assume  that  o  =  {  do,  Q||  •  •  • } 
and  (3  arc  two  p-adic  integers  such  that  a  k  =  A  (mod  pk),  for  all  k\  that  is,  ||Qfc  — A  lip  =  pfc+1- 
Then  we  say  that  a  and  are  equal.  If  this  is  the  case  then  they  must  have  the  same  power 
series  representations.  Prom  now  on  we  will  only  work  with  the  power  scries  representations  of 
p-adic  integers. 

Negative  integers  also  have  p-adic  representations.  Consider  the  number  — 1.  One  sequence 
of  good  approximations  by  positive  integers  is  pk  —  I.  But  this  can  be  written  in  the  following 
manner 

pk  -  1  =  (p  -  l)(pfc-'  -}-  pk~2  +  •  •  •  +  p  +  1) 

(p  -  l)  +  (p  - 1)  p  -f  (p  -  l)  •  p2  H - 1-  (p  -  l)  •  p*_i 

Taking  the  limit  as  k  — *  oo  we  have 

-1  =  (p  -  1)  -|-  (p  -  1)  •  p  +  (p  -  1)  •  p2  +  •  ■ 
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It  is  easy  to  sec  that  this  is  one  less  than  zero: 

-l+l  =  l+(p-l)  +  (p-l)p  +  (p—l)-p2  +  -.- 

=  p  +  (p  —  1)  •  p  H — 

=  p2  +  (p  —  l)  V  H — 

=  0. 

Arithmetic  with  p-adic  integers  is  performed  in  much  the  same  manner  as  power  series 
arithmetic.  The  only  difference  is  that  a  normalization  step  must  be  performed  to  ensure  the 
coefficients  of  p*  lie  in  the  range  0  <  a*,  <  p.  For  instance,  to  compute 

q2  =  (1  +  3  +  32  +  2  •  3<  4-  •  •  )2 

in  the  3-adic  integers  we  begin  by  performing  the  power  series  multiplication. 

q1  =  1  +  2  •  3  +  3  •  32  +  2  3*  +  5  •  3»  +  •  •  • 

=  1  +  2  •  3  -f  (1  +  2)  •  3:)  +  5  ■  34  +  •  •  • 

=  l  +  2-3  +  (l  +  5)-34  +  .-. 

=  1  -h  2  -  3  -} - 

The  p-adic  integers  clearly  include  all  of  Z  since  the  positive  integers  and  — I  are  p-adic 
integers.  There  are  other  elements  also.  For  instance, 

-hrb  =  ,  +  3+3J+33+ - 

So,  —1/2  is  an  clement  of  Zj. 

Certain  algebraic  numbers  also  lie  in  Z;,.  We  will  now  show  that  X2  —  7  has  a  solution 

that  lies  in  Z.».  Let  the  symbol  s/l  represent  a  solution  of  X 2  —  7.  If  \/7  does  lie  in  Z3  then 

we  must  have 

y/7  =  oo  “f"  °i  ■  3  -f-  Ui  ■  32  -f-  • 

Therefore,  we  must  have  oq  —  7=0  (mod  3),  so  ao  is  either  1  or  — 1.  We  pick  1;  using  the 
other  solution  of  X2  —  7  using  identical  steps. 

As  with  the  computation  of  a<),  the  computation  of  the  other  proceeds  by  considering 
the  equation 

(oo  +  o,  •  3  4.  •  •  •  }2  -  7  SB  0  (mod  3<+1) 
and  solving  the  resulting  equation  for  a,.  Thus  to  determine  ai, 

(1  -f-  a(  •  3  -| - )2  —  7  =  1  +  2a|  •  3  —  7  =  0  (mod  32) 

So  ai  =  1  and  >/7  =  1  -(-  3  ■  •  ■.  Continuing 

7  =  (1  +  3  +  a2  32)2  =  16  +  8 ai  ■  32  (mod  33) 
so  =  1.  This  may  continued  arbitrarily  far: 

v/7  =  1  -f  3  +  32  -f  2  ■  34  -f  •  •  •. 

Notice  that  the  equations  for  a  1  and  were  both  linear.  This  is  generally  the  case. 
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3. 


m-adic  Arithmetic 


This  section  requires  a  bit  more  mathematical  sophistication  then  the  earlier  chapters  and  a 
bit  deeper  understanding  of  the  concepts  of  abstract  algebra.  However,  we  use  this  sophistication 
only  to  obtain  more  general  algorithms,  the  reader  could  merely  interpret  most  everything  we 
say  using  p-adic  integers  and  would  lose  little  in  terms  of  content. 

Let  ft  be  a  commutative  ring,  m  an  ideal  of  R.  Then  there  is  a  sequence  of  rings  and 
canonical  homomorphism  as  follows. 

in  ^  n  /  2  ^3  0)  mi 

R/m  1 - /?/ m  « - -  « - R/ m  «  —  •  •  (3) 

Consider  the  set  —  R/m  X  R/m2  X  •  and  an  element  of  («,)  G  where  a,  G  R/ m‘.  An 
element  of  is  said  to  be  coherent  if  0, .|.|a,  +  |  =  a,  for  all  i.  The  set  of  all  coherent  elements 
of  %  is  a  ring.  This  ring  is  called  the  inverse  limit  of  of  the  sequence  and  we  write 

RtJ1  —  lim  R/m'. 

R„l  is  said  to  be  the  m-adic  completion  of  R.  The  last  section  dealt  with  the  case  R  =  Z.  The 
only  ideals  are  the  principal  ones  generated  by  a  rational  integer,  m.  The  sequence  of  rings  in 
this  case  is  Z/(m)‘  =  Z/(m'). 

For  example,  \/l  is  represented  in  Qj  the  coherent  sequence 

>/7  =  (1,4,13, 13, 175,...)  =  a. 

If  we  write  this  sequence  as  a  sequence  of  differences  we  have  the  familiar  form,  ( l,  3, 9, 162, . . . ). 
Unfortunately,  the  subtractions  we  just  performed  have  to  be  made  precise.  We  subtracted  an 
element  of  R/(m')  from  an  element  of  R/(m,+l).  Thus  we  need  an  embedding  of  f?/(m‘)  into 
R/(m’+>),  i.e.  the  inverse  of  0,-|_|.  There  is  no  unique  inverse.  Before  resolving  this  problem, 
we  want  to  examine  one  additional  example  to  bolster  our  intuition. 

Let  F  be  a  field  and  consider  the  ring  R  =  F[x,  y,z\.  The  ideal  m  =  [x,y,z)  is  maximal, 
so  R/m  is  a  field.  Intuitively,  the  m-adic  completion  of  R  should  be  the  ring  of  power  series 
in  x,  y  and  z,  F\\x,y,z\\.  This  is  in  fact  what  happens.  The  first  ring  of  (3)  is  F.  The  next 
ring  is  R/m  which  is  the  ring  of  power  scries  in  x,  y  and  z  truncated  after  first  order.  Since 
(x,  y,z)2  =  (x2,yl,z2,xy,xz,yz),  the  next  ring  is  the  truncated  power  series  ring  of  order  2. 
Clearly  the  inverse  limit  is  the  general  power  series  ring.  The  map  6 i  merely  truncates  the  power 
series  to  order  i  —  1. 

Again  the  inverse  map  is  not  unique,  but  since  there  is  an  natural  embedding  of  the  ring 
of  power  scries  of  order  i  into  the  ring  of  order  t  1,  0i+l  has  a  "natural  inverse.”  Notice  that 
we  can  extend  0,  to  a  map  from  RtJ1  onto  R/m'~l. 

We  would  like  to  write 

q  =  Qo  -j-  Oj  •  m  -f  02  •  m2  H - . 

So,  in  some  sense,  a,  m*  a  the  projection  of  a  to  the  “subspacc  spanned  by  jnl  There  are  only 
two  conditions  which  need  to  be  satisfied  by  a*  •  m* 

(1)  Qj  m'  G  mlRm 

(2) a-EL()Q.miGm'+l 

The  first  property  ensures  that,  under  the  m-adic  topology,  the  o,  m’  get  smaller  and 
smaller,  i.e.  their  sum  will  converge.  By  the  second  condition  their  sum  will  converge  to  a.  This 
is  all  we  need.  The  natural  mappings,  if  they  can  be  determined,  arc  suitable. 


Chapter  VI 


Hensel’s  Lemma 


In  its  most  common  form,  Ilensel’s  lemma  indicates  when  a  factorization  of  a  polynomial 
over  a  ring  R/ m  can  be  lifted  to  a  factorization  over  the  ring  RIJX.  Lately,  it  has  been 
used  to  lift  factorizations  modulo  a  prime  to  factorizations  over  the  integers  and  then  to 
multivariate  factorizations.  It  is  also  used  to  lift  GCD  and  other  computations  useful  in  algebraic 
manipulation. 

In  this  chapter  we  will  present  a  “new”  framework  for  discussing  computational  formulations 
of  Hensel’s  lemma.  Actually,  the  formulation  we  use  is  fairly  well  known  in  mathematical  circles. 
In  the  first  section  we  present  the  new  framework  and  show  how  to  solve  simple  problems  with 
it  without  making  any  particular  appeal  to  Ilcnsel’s  lemma.  In  the  second  section  we  prove 
ilenscl’s  lemma,  in  its  more  familiar  form,  using  the  techniques  of  the  first  section.  The  final 
section  presents  the  formulation  that  been  used  in  algebraic  manipulation  circles  and  discusses 
its  similarities  and  differences  with  our  formulation. 


1.  A  New  Formulation  of  Hensel's  Lemma 

Our  version  of  Hensel’s  lemma  uses  an  m.-adic  version  of  Newton’s  iteration  to  obtain  the 
zeroes  of  a  system  of  equations  in  an  m-adic  field.  The  first  part  of  this  section  converts  a 
number  of  well  known  problems,  and  a  few  problems  not  so  often  considered,  into  systems  of 
equations  which  must  be  solved.  In  the  second  part  we  show  how  to  apply  Newton’s  iteration 
to  systems  of  equations  over  m-adic  fields.  We  concentrate  on  a  linearly  convergent  algorithm 
since  it  is  rather  simple,  but  in  the  final  section  we  also  discuss  the  quadratic  algorithm  and 
point  out  its  draw-backs. 

1.1.  Reducing  Problems  to  Solving  Equations 

Both  factoring  polynomials  and  computing  their  GCD’s  are  included  among  the  problems 
which  can  be  recast  as  systems  of  coupled,  non-linear  equations.  By  reducing  these  and  other 
problems  to  systems  of  equations  we  will  obviate  the  individual  analyses  which  were  previously 
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required.  The  algorithms  produced  using  this  method  benefit  from  all  the  advantages  ascribed 
to  the  “Henscl  lemma  technique”  and  our  improvements  to  it. 

The  first  problem  we  consider  does  not  appear  to  have  been  studied  much  but  is  interesting 
nonetheless.  As  we  shall  sec  it  leads  to  a  useful  heuristic  for  solving  quintic  equations. 

Assume  f{z)  is  a  monic  polynomial  over  Z.  We  want  to  know  if  f[z)  has  a  polynomial  divisor 
of  a  given  degree  over  Z.  In  its  most  common  form  this  problem  asks  whether  f(z)  has  any  linear 
factors  and  thus  possesses  integral  zeroes.  By  determining  if  f(z)  has  any  quadratic  factors  we 
will  be  able  to  determine  if  f(z)  has  any  zeroes  in  a  quadratic  number  field. 

To  illustrate  this  technique  assume 

f(z)  =  z'  +  C\ZA  4-  CtZ3  +  C;)22  +  C.\Z  -f  C5 

and  that  we  are  looking  for  a  quadratic  factor.  (Notice  that  if  we  can  find  a  quadratic  factor 
then  what  will  be  left  is  a  cubic  polynomial.  Thus  this  could  form  the  basic  of  a  simple  algorithm 
for  computing  the  roots  of  certain  types  of  quintic  polynomials.)  Any  such  quadratic  factor 
will  be  monic  and  thus  of  the  form:  g[z)  =  zl  -f-  az  b.  Since  g(z)  divides  f(z),  the  remainder 
must  be  zero.  Dividing  f(z)  by  g{z)  we  get  a  remainder  of 

(o'1  —  C|03  -f  c2a 2  —  Cjo  -f  b2  —  c-ib  —  3 a2b  +  2c,  ab  -f  c4]z 

a3b  —  2 ab2  —  c,a26  4-  c,b2  4-  Cgofc  —  cjb  4~  C5. 


For  this  to  be  zero  the  coefficients  of  both  the  linear  and  constant  terms  must  be  zero.  We  thus 
get  the  following  two  equations  in  the  unknowns  a  and  b. 

a1  —  C|0  *  4-  ci °2  —  c3a  4"  h2  —  Cjb  —  3 a2b  4-  2c,  a6  4"  c\  —  0 
a3b  —  2abl  —  c,alb  4 -  c,b2  4"  czub  —  O b  4~  C5  =  0 


(1) 


Any  solution  of  this  system  of  equations  will  yield  a  quadratic  polynomial  which  divides  f[z). 

From  the  classical  theory  of  equations,  we  know  that  all  quintic  equation,  over  the  rationals, 
can  be  transformed  into  the  Bring-Jcrrard  form  z'  -\~c,z  4-C5  =  0.  Assuming  c ,  —  C2  =  C3  ==  0 
equations  (1)  simplify  considerably.  By  elimination  theory  we  find  that  a  and  b  arc  zeroes  of 


the  following  polynomials 


a10  —  3c.ia°  4-  UC5O5  —  4c2, a2  —  4c.,c5a  4*  £5, 
b'°  -  c.,68  -  c26°  -  2c265  4-  c-y  4-  c4clb3  4-  c\. 


Consider  now  the  general  case,  f[z)  is  a  polynomial  of  arbitrary  degree  and  we  are  looking 
for  a  polynomial,  g(z),  which  divides  f(z)  and  is  of  degree  n.  The  remainder  of  f(z)  when  divided 
by  g(z)  is  of  degree  n  —  l.  Equating  each  to  zero,  wc  get  n  equations.  There  were  n  unknown 
coefficients  in  g(z)  since  wc  know  g(z)  is  monic. 

If  f(z)  is  not  monic  we  can  still  use  this  procedure.  Using  an  additional  indeterminate  for 
the  leading  coefficient  will  not  work  since  there  doesn’t  seem  to  be  an  additional  equation  lying 
around.  Instead  we  monicizc  the  polynomial.  Let 

f(z)  =  coz"  4-  ci*"-1  4 - h  c*» 

Wc  could  monicizc  f(z)  by  dividing  it  by  q,  but  this  would  introduce  fractions  that  would  be 
difficult  to  handle.  Then 

Co  7(z)  -  M”  4-  Web*)"-'  +  c „<r  *  =  /M- 

Since  f(z)  is  monic  we  can  use  the  above  procedure  with  /  and  then  adjust  the  leading  coefficient. 

On  the  other  hand  ir  we  knew  the  leading  coefficient  or  g(z),  or  for  that  matter  any  coefficient 
of  g(z),  then  we  would  have  enough  equations  to  solve  the  system. 
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Assume  wc  have  a  monic  polynomial 


which  wc  are  trying  to  factor  into  two  polynomials  g  and  h  of  degrees  r  and  s  respectively.  Then 
we  have 

fix)  =  W  4"  9\xT  1  4  ' ' '  4  9r)[x ’  4  h\x3  1  4  •  •  •  4  ha) 

Multiplying  out  the  polynomials  and  equating  the  coefficients  of  x  gives  the  following  system 
of  equations. 

9i  4  =  /i 

9i  4  4  hi  =  h 


(1) 


9rk ^ — |  4  9r — |ha  —  fn — | 

9tK  =  /„ 

The  fi  are  known  so  we  have  n  equations  in  r  -f  s  =  n  unknowns.  Since  there  are  the  same 
number  of  unknowns  as  equations  wc  have  reason  to  believe  that  there  is  are  at  most  a  finite 
number  of  solutions  to  the  system  of  equations. 

Notice  that  had  we  not  assumed  that  /  was  monic  then  there  would  have  been  an  additional 
equation 

fllAt  =  A- 


Unfortunately,  there  are  two  additional  unknowns.  The  system  would  not  have  had  a  finite 
number  of  solutions.  However,  if  one  of  the  g,  or  h,  is  known  then  the  solution  set  would  be 
finite.  (This  is  a  slight  generalization  of  Henscl’s  Lemma.) 

If  it  was  known  that  f(x)  could  be  expressed  as  the  product  of  more  than  two  factors  then 
one  could  use  the  technique  sketched  above  recursively.  However  it  is  possible  to  set  up  the 
systems  of  equations  for  more  than  two  factors  without  much  difficulty  For  instance,  assume 
f(r)  is  known  to  be  the  product  of  three  factors  A(x)B(x)C(x)  of  degrees  r\ ,  r2  and  r  >  respectively. 
Wc  then  know  that 


f{x)  =  A(x)B(x)C[x) 

Xn  A - f  fn  =  (xr'  -f - b  ar,)(zn  H - b  iViX*”3  -I - b  4>) 

=  ir'+^+r3  4  aix^+^  4  6,xr,+r *  -b  cxxr'+r>  4-  •  •  •  +ar,6rjc^ 


Again  we  equate  the  coefficients  x’  on  the  right  hand  side  of  the  equation  with  those  on  the  left 
hand  side.  There  are  n  =  ri  4  r2  -f-  r:)  equations  and  an  equal  number  of  unknowns,  a*,  bj  and 
0.  so  wc  seem  to  have  won  again. 

Recall  that  the  square-free  decomposition  of  a  polynomial  f(z)  is  a  factorization  into  the 
form 

![z)  =  Pdz)Pl(z)2PM>-  Pn(*r 

where  each  of  the  P,  is  square-free  and  every  pair,  P,  and  PJt  is  relatively  prime.  The  classical 
way  to  compute  the  square-free  decomposition  of  a  polynomial  is  to  first  compute  the  GCD  of 

/(*)  fi2)- 

gcd(/(*),/(*))  =  P2{z)Pl(z)  ■  Pn[z)n 

This  allows  us  to  determine  P\{z)  and  wc  can  repeat  the  process  with  the  GCD  to  compute  P2 
and  so  on.  4 


47 


For  multivariate  polynomials,  the  first  GCD  is  extremely  costly.  Trager  and  Wang  [Tra79] 
have  suggested  using  an  evaluation  homomorphism  to  reduce  to  a  univariate  polynomial, 
computing  the  square  free  decomposition  using  univariate  GCDs  and  then  using  llenscl’s  lemma 
to  get  the  square-free  factorization.  Using  our  formulation,  all  that  is  necessary  to  lift  the 
univariate  factorization  is  to  set  up  a  system  of  equations  that  must  be  solved.  For  n  —  2,  this 
is  quite  easy. 


}{x)  =  A(x)B(x)2 

xn  +  ■••+/.,  =  (*r|  4  ■  ■  4  ari)(zri  +  -  •  -  +  6r,)2 

=  *r' +2ri  4  (a,  +  2  b,)xr'  +2’*-1  +  •  •  -h  Orft 

For  larger  n,  similar  results  are  easily  obtained. 

The  GCD  problem  can  be  handled  in  a  very  similar  manner.  Assume  we  want  to  compute 
H  which  is  the  GCD  of  F  and  G.  Again  we  will  treat  all  polynomials  as  polynomials  in  x  with 
coefficients  which  are  to  be  determined.  Letting  F  =  All  and  G  —  1)11  we  get  a  system  of 
equations  which  is  similar  to  the  system  which  was  derived  for  a  factorization  (1),  To  obtain  a 
solution  to  this  system  we  will  also  have  to  obtain  values  for  A  and  B. 

It  would  be  interesting  to  see  if  it  is  possible  to  compute  H  without  computing  the  cofactors. 
It  is  possible  to  set  up  a  system  of  equations  which  does  this  but  in  practice  it  will  probably 
prove  too  expensive.  Since  H  divides  F,  the  remainder  of  F  when  divided  by  II  must  be  zero. 
This  gives  one  set  of  equations  which  the  coefficients  of  II  must  satisfy.  In  fact  it  is  possible  to 
do  the  same  with  G  and  produce  2  systems  of  equations  which  //  must  satisfy.  For  instance 
with  F  and  G  monic,  of  degree  4  and  H  quadratic,  we  get  the  following  system. 


2h, hz  -  f\h‘i  -  h\  +  M2  -  M,  -f  h  =  0 

h,2  —  /i2/i-2  4  /ih,h'2  ~  fih‘2  4  f\  —  0 

2h\hi  —  g{h2  —  h]  4  —  gzh\  4  S»  =  0 

h'2  —  hffo  4  9\h\hi  —  gzhz  4  9i  —  d 


Remember  that  only  /i,  and  hz  are  variables  here.  Everything  else  is  determined  a  priori.  At 
most  two  of  these  equations  will  be  independent. 

I  have  been  unable  to  produce  a  similar  set  of  equations  for  resultants.  All  the  systems 
I  have  managed  to  produce  have  had  fewer  equations  than  unknowns.  It  seems  unlikely  that 
a  system  of  equations  which  has  only  a  finite  number  of  solutions  exists,  but  a  proof  seems 
difficult.  From  a  practical  point  of  view  this  docs  not  really  matter  since  the  sparse  modular 
lifling  algorithm  can  be  used  to  compute  resultants.  Nonetheless  it  is  still  curious  that  no  such 
system  seems  to  exist. 

1.2.  Solving  the  Systems  of  Equations 

Now  that  we  know  how  to  set  up  systems  of  equations,  we  need  to  show  that  there  actually 
is  an  easy  way  to  solve  the  resulting  systems  of  equations.  We  will  use  the  m-adic  language 
developed  in  section  VI. 3  throughout  this  section. 

We  assume  that  II  is  a  commutative  ring  with  unit  and  that  m  is  an  ideal  of  ft.  For  simplicity, 
we  begin  with  the  single  variable  version  of  Newton’s  iteration.  Let  f(X)  be  a  polynomial  in  X 
over  It.  Let  x,  he  a  coherent  sequence  of  values  in  ft  such  that  /(x*)  s  0  (mod  Clearly 
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xk  converges  to  an  element  of  RU1,  r.  If  we  arc  lucky,  the  limit  will  also  lie  in  R.  There  is  a 
canonical  embedding  of  R  in  RiA1  so  we  might  as  well  consider  /(A  )  to  be  a  polynomial  over  RISL 
r  can  be  written  as  a  series 


r  —  ro  +  r,  •  m  +  r  2  ■  m  -| - , 

where  rk  •  jnk  =  (zt  —  Xk~ i).  Following  Newton’s  example,  we  expand  f\Xk)  as  a  power  series 
in  (xk  —  zii_1): 

fM  =  /(*■— i )  +  f(xk~  i )(%  —  xi,— i )  4 — 

Since  ( xk  —  Xk—  i)  =  rk  ■  jj/’  £  mk,  (xk  —  xk-  i)2  will  be  zero  modulo  tr/'  f  1  So  we  have 

0.  =  /(%•)  s  f(xk  i )  -f  f'{xk-  i  )(x*  —  xk  , )  (mod  m* + 1 )  (2) 


This  equation  can  be  solved  for  the  correction  term  (xk  —  i/,  i)  Unfortunately,  using  this 
iteration  involves  computing  the  inverse  of  f'[xk-  i)  at  each  stage  in  the  iteration.  There  are 
two  choices,  if  we  compute  the  the  inverse  at  each  step,  and  make  use.  of  all  the  information 
available,  the  iteration  converges  quadralically.  On  the  other  hand,  if  we  need  an  iteration 
which  is  only  linearly  convergent  it  is  possible  set  the  iteration  up  so  that  only  one  inversion 
is  performed. 

Since  (xk  —  xk—\)  is  an  clement  of  rn*'/?^  we  need  only  compute  f'[xk~\)  modulo  m.  Again 
using  Taylor’s  theorem,  this  time  to  expand  f'(xk—  i)  at  z<),  wc  get 


m-i) = fM + /"(*D)(*k-i  -«.)+■••= n*o)  (3) 

all  modulo  m.  Combining  (2)  and  (3)  wc  have 

0  =  /(xfc_ 1 )  /'(z«){zfc  —  zfc_ , )  (mod  mM  1 ) 

zfc  —  zfr_ i  s  —  (mod  mfc+l)  (4) 

Since  k  steps  are  required  to  compute  an  approximation  to  r  whose  first  k  terms  agree 
with  r  we  say  that  this  iteration  is  linearly  convergent.  It  is  possible  to  produce  a  quadratically 
convergent  iteration  by  a  very  similar  method.  We  do  this  in  section  1.3. 

Wc  give  a  simple  example  to  illustrate  the  linear  iteration.  Assume  wc  wish  to  compute 
the  square  root  of  an  integer.  The  most  common  method  used  is  to  use  Newton’s  iteration  with 
integers  and  round  everything  off  to  integers.  The  algorithm  we  present  computes  the  square 
root  of  n  using  a  linearly  convergent  iteration.  For  the  starting  point  wc  use  the  square  root  of 
n  computed  modulo  3.  We  begin  by  dividing  out  ail  powers  of  3.  If  an  odd  number  of  powers 
of  3  divided  n  then  n  has  no  square  root.  Now  n  is  congruent  to  either  1  or  2  modulo  3.  If  2 
then  n  docs  not  have  an  integral  square  root  otherwise  the  initial  “guess"  is  1. 

For  /  we  use  }{r)1  —  n  =  0  Equation  (4)  now  reads 


it 


Xk_\  = 


/(Xfc— l) 
2x() 


(mod  mk+l) 


m  ==  (.?}  and  z*,  is  always  I  Both  the  left  and  right  hand  sides  arc  divisible  by  3k  so  this  really 
is  a  congruence  modulo  3.  Since  2l  —  I  (mod  3)  the  correction  term  is 


(2(*t^)(.nod  3)). 
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We  now  derive  the  iteration  for  solving  systems  of  equations.  The  procedure  is  very  similar, 
but  it  is  easy  to  become  mired  in  indices  if  some  sort  of  vector  notations  is  not  used.  On  the 
other  hand,  vector  notations  can  easily  lead  to  wrong  answers  if  the  basis  behind  them  is  not 
thoroughly  understood. 

First,  Taylor’s  theorem  in  several  variables:  Let  }{z\, . . .  ,2„)  be  a  function  in  the  variables 
zi,...,zn.  Taylor’s  theorem  gives  the  power  scries  expansion  at  (i|,...,xn) 


/(* l  >•••>*>,)  =  /(* Xu)  +  Xa  a~(x'» ' ' ' '  x«)te  ~  x«) 


1=1 


(5) 


2 


The  first  summation  in  equation  (5)  can  be  viewed  as  the  dot  product  of  two  vectors 


[z  ~  x)  =  (z\  —  X\,»i  —  x-i,  ■  ■  ■  ,zn  —  xn) 

. «<*» 

where  we  have  used  x  as  an  abbreviation  for  (x|,x2l  •  •  •  ,zn).  If  we  treat  the  first  of  these 
vectors  as  a  column  vector  and  the  second  as  a  row  vector  the  dot  product  is  merely  matrix 
multiplication. 

Using  this  notation,  the  first  two  terms  of  Taylor’s  formula  can  be  expressed  as 

m = /{i) + !?(*)  •(*—*)+•••• 


(It  is  a  good  thing  we  do  not  need  more  terms.) 

If  we  have  several  f’s  then  each  may  be  expanded  as  a  power  series  like  (5).  If  (/i, . . . ,  fn) 
is  treated  as  a  column  vector  we  sec  that 


\s/»  ...  oiL) 

\  ni\  Oz<2  OTn  ' 


This  matrix  is  known  as  the  Jacobian  of  /,  which  we  write  as  J .  We  then  have 

f{z)=f{x)  +  J{x)(z-x)-{ - 

Now  assume  that  i*.  is  a  sequence  of  column  vectors  which  converge  to  a  solution  of  /(«). 
Further  assume  that  f(x *)  s  0  (mod  m*  +  l).  So  the  sequence  of  i*.  is  a  coherent  sequence  which 
converges  m-adically  to  a  zero  of  the  system  of  equations.  Then  using  (12)  we  have 

o  =  f(ik )  =  f(h-i)  +  ■  [h  ~  ifr-i)  (mod  mfc+1). 


Again  we  are  only  interested  in  a  linearly  convergent  iteration  at  this  point.  By  the  reasoning 
used  in  the  univariate  case  we  may  replace  J(x it— i)  by  J( iq)  so 

Zfc  —  ifc-i  =  ■  /(ifc-i). 


This  iteration  is  called  the  IJensel  lifting  algorithm.  Notice  that  it  is  only  applicable  when  the 
Jacobian  is  invertible.  We  state  this  as  a  theorem. 
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Theorem  3.  Let  {  fi(z\ , . . .  ,ztl)  l’1^,  be  a  system  of  a  system  of  equations  over  RtJll  where  m  is 
an  ideal  of  R.  Let  J  be  the  Jacobian  of  {/  }.  If  f(x q)  =  0  (mod  m)  and  if  the  determinant  of 
J{iu)  is  invertible  over  Rl2l/m  then  there  exist  elements  of  R^,  x  such  that  ft{i)  =  0,  and  they 
may  be  determined  effectively. 

The  initial  solution  of  the  system  of  equations,  iq  is  called  the  starting  point  of  the  iteration. 
If  J[z i))  is  invertible  then  iq  is  said  to  be  a  good  starling  point.  If  the  Jacobian  matrix  is  not 
invertible,  cither  because  the  i<)  was  a  bad  starting  point  or  because  the  Jacobian  itself  was  not 
square,  then  the  system  of  linear  equations  which  is  to  be  solved  at  each  step  will  not  have  a 
unique  solution.  There  are  cases  where  this  can  be  useful,  for  instance  as  a  sieve  for  possible 
solutions  to  diophantine  equations  over  the  integers.  This  idea  has  also  been  advanced  by  Lewis 
|Lcw69]  and  Laucr  (Lau78). 

The  quadratic  form  of  the  iteration  uses  J(xi.  _i)  in  place  of  J(x o).  Thus 

£■2*  —  22“-'  =  —  Jfak-i)  f[x2k-i). 

Unfortunately,  this  iteration  requires  that  the  inverse  of  a  new  Jacobian  be  computed  at  each 
stage  of  the  iteration.  Since  R/m  is  field  this  is  easy  for  the  first  step.  Afterwards,  rational 
expressions  may  appear  and  complicate  the  situation.  Also  note  that  J(x^ )  is  invertible  in 
Riu/m 2  if  and  only  if  J(zu)  is  invertible. 


2.  Proof  of  Hensel’s  Lemma. 

Hcnsel’s  lemma  is  usually  formulated  by  investigators  in  algebraic  manipulation  in  terms  of 
the  factoring  problem.  Last  section  has  shown  that  when  the  Jacobian  of  a  system  of  equations 
is  invertible,  it  is  possible  to  lift  a  solution  of  a  system  of  equations  modulo  m  to  a  solution 
in  the  m-adic  completion.  The  version  of  llenscl’s  lemma  which  is  most  commonly  seen  these 
days  deals  specifically  with  the  problem  of  factoring  polynomials.  This  version  indicates  that 
factorizations  modulo  m  can  be  lifted  to  factorizations  over  the  completion  if  the  factors  are 
relatively  prime.  More  precisely, 

Theorem  4.  (llcnscl’s  Lcrnmaj  Let  f(X)  be  a  monic  polynomial  over  R,  m  an  ideal  of  R.  If 
there  exist  elements  of  /?/m|X],  g(X )  and  h(X),  such  that  f[X)  —  g(X)h(X)  is  an  element  of 
Difi(X]  and  g(X)  and  h[X)  are  relatively  prime,  then  g( X)  and  h(X)  6  exist  such  that 

f{X)  =  g(X)h(X)  and  g(X)  -  g(X)  and  k(X)  -  h{X)  are  in  m R\X\. 

To  understand  this  denote  the  coefficients  of  /(X),  g(X)  and  h(X)  by  /,,  gj  and/i*.  respectively. 
The  /,  arc  known  and  the  gi  and  /i*  arc  to  be  determined.  Writing  out  the  system  of  equations 
derived  in  section  1.1  we  have 

g\  +h\  =  f\ 

9i  +-  ffihi  -{-hi  =  fi 

9rK  i  gT—  i  h,  =  /„_  | 

9rK  —  in 

For  the  starting  point  wc  can  use  the  coefficients  of  g(X)  and  h(X).  So  by  theorem  2,  it  is 
possible  to  determine  g(X)  and  h{X)  if  the  Jacobian  of  this  system  of  equations  is  invertible. 


The  Jacobian  happens  to  be 


/I 

0 

0 

0  .. 

1 

0 

0  ...\ 

hi 

1 

0 

0  .. 

9i 

1 

0  ... 

hi 

hi 

i 

0  .. 

9i 

Si 

1  ... 

to 

ha 

0 

0 

...  gr  ) 

The  determinant  of  this  matrix  is  also  called  the  resultant  of  g(X)  and  h(X).  It  is  known  to 
be  zero  if  and  only  if  g  and  h  have  a  nontrivial  common  factor.  If  the  determinant  is  non-zero 
then  we  will  be  free  to  use  the  linear  llcnsci  iteration  as  far  as  we  please. 

To  see  how  this  formulation  actually  works,  consider  factoring  the  polynomial 

f{Z)  =  Z(‘  +  67 Z*  -f  41 Z 1  -f  2781Z  *  +  737 Z'1  +  943 Z  +  253 

over  the  rational  integers.  For  this  sort  of  problem,  we  choose  m  to  be  an  ideal  generated  by 
some  rational  prime.  The  initial  point  is  determined  by  using  the  Bcrlekamp’s  algorithm  |Ber70] 
for  factoring  polynomials  over  finite  fields. 

To  get  an  idea  what  its  factorization  will  be  over  the  integers,  f(Z)  is  factored  modulo 
the  primes  between  2  and  199.  We  discover  that  modulo  2,  3,  5,  17,  67,  71,  127  and  179,  /  is 
the  product  of  two  factors  of  degree  3  With  other  moduli  /  splits  into  more  pieces.  Careful 
examination  of  the  2  cubic  factors  shows  that  each  has  one  term  whose  coefficient  is  zero.  In 
fact  this  is  true  for  all  irreducible  cubic  factors  for  any  modulus.  Thus  we  conjecture  that 

f{Z)  =  {Z3  -f  aZ*  +  b)(Z 3  +cZ  +  d). 

While  we  would  not  expect  an  implementation  of  a  general  purpose  factoring  algorithm  to  apply 
this  trick,  we  will  use  it  for  two  reasons.  First,  it  simplifies  the  equations  which  are  developed 
somewhat,  and  second,  this  example  will  also  indicate  how  appropriate  additional  knowledge 
about  the  problem  to  be  solved  can  be  utilized.  Multiplying  the  candidate  factorization  of  f(Z) 
out  we  arc  led  to  the  following  system  of  equations 

a  =  67 

c  =  41 

6  +  ac  +  d  =  2781 
ad  =  737 
be  =  943 
bd  =  253 


(6) 


Since  we  knew  (or  at  least  hoped)  that  two  of  the  terms  in  the  factorization  were  zero, 
there  are  two  more  equations  than  unknowns.  In  fact,  the  equations  are  now  trivial  to  solve. 
To  make  things  a  bit  more  interesting  we  will  ignore  the  first  two  equations.  We  use  5  as  the 
modulus,  i.e. ,  m  =  (5).  Bcrlckainp's  algorithm  gives  f[Z)  =  (Z 3  -f-  2Z  —  2 ){Z*  1),  so 

for  initial  approximations  we  have  <j<j  =  2,  bo  =  — 2,  cu  =  1  and  db  —  1.  The  inverse  of  the 
Jacobian  of  the  system  of  equations  is 


J  '  = 


/co 

i 

0(1 

1 

f~l 

0 

-1 

do 

0 

0 

«o  1  _ 

f  l 

-1 

1 

0 

Co 

bo 

0  1 

1  ~2 

2 

1 

v  6 

do 

0 

bo) 

K-2 

2 

-2 
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To  compute  the  next  stage,  we  first  determine  the  correction  terms  which  were  denoted  by 
f[x A  — i)  in  the  previous  section.  To  do  this,  the  equations  in  (6)  arc  evaluated  at  the  starting 
point  modulo  25.  We  then  get  a  column  vector  which  is  multiplied  by  —  J~ 1 


Adding  this  to  the  term  starting  point,  we  have  aj  =  —8,  6i  =  — 7,  C|  =  —9  and  d\  =  II. 
Repeating  this  process, 


la  1  \ 

(  50  \ 

6, 

=  —J'1  ’ 

50 

-50 

w 

\dj 

V  —25  / 

25 

50 


So  a-2  =  —58,  b-/  =  23,  ci  =  41  and  di  —  1 1.  Repeating  this  iteration  one  last  time  we  are  led 
to  the  final  factorization 


f(X)  =  (X3  +  MX1  +  23)(X3  +  HX  +  II). 

3.  Zassenhaus'  Formulation  of  Hensel's  Lemma 

The  “old”  version  of  Ilcnsel’s  lemma  was  first  proposed  by  Zassenhaus  [Zas69],  Wang  and 
Rothschild  [Wan75]  and  Musser  [Mus75]  utilized  Zassenhaus’  ideas  in  their  factoring  algorithm. 
Using  the  ideas  of  Moses  and  Yun  |Mos75],  Yun  |Yun73,  Yun76]  investigated  the  general 
applicability  of  Hensel’s  lemma  to  problems  in  algebraic  manipulation. 

As  formulated  by  Yun  [Yun76],  Zassenhaus’  version  of  Ilcnscl’s  lemma  differs  from  ours  in 
that  the  number  of  equations  produced  is  smaller  than  the  number  of  variables.  For  instance, 
in  the  factoring  problem,  the  Zassenhaus  approach  determines  G  and  H  by  solving  the  equation 

/(*)  —  GH  =  0  (7) 

It  is  only  when  the  solution  of  this  equation  is  restricted  to  the  ring  ZP\X]  that  (7)  has  a  unique 
solution.  By  using  a  p-adic  technique  the  non-linear  diophantine  equation  (7)  can  be  reduced 
to  a  scries  of  easy  to  solve,  linear  equations.  Then  by  piecing  together  the  solutions  of  these 
linear  equations,  G  and  //  can  be  determined.  The  procedure  used  is  not  very  complex.  Here 
we  will  outline  the  main  points  and  demonstrate  the  connection  with  our  formalism. 

Let  f(X)  be  a  univariate  polynomial  over  the  integers,  and  assume  that  it  has  two  irreducible 
factors,  G  and  II.  We  know  that  Go  and  //()  are  factors  modulo  p  and  vve  want  to  lift  them  to 
G  and  H .  Writing  G  and  II  p-adically 

f(X)  =  (G„  +  G,p  +  •  •  )(//«  +  WiP  +  •) 

—  Go//o  +  (//|Go  +  G|7/o) 

We  know  that  f(X)  —  Gn/fo  is  a  multiple  of  p,  so  dividing  by  p  wc  have 

lit  Go  +  G,  /In  -  (/(X)  -  GoH,)/p  (mod  p)  (8) 
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This  is  a  linear  diophantine  equation  in  G\  and  H\.  Its  solution  is  obtained  by  first  solving 
A, Go  4  B,Ho  =  X'  (mod  p)  for  various  i  by  the  Euclidean  algorithm.  Since  the  left  hand  side 
of  (8)  is  a  polynomial  in  X,  G\  and  H\  can  be  determined  by  adding  up  the  appropriate  B,  and 
Ai  respectively. 

Having  computed  G \  and  H\,  the  other  terms  may  be  computed  similarly.  For  a  linear 
iteration,  the  process  continues  as  follows: 


f(X)  =  (Q,  +  GlP  +  G2p2  4  ■  •  •)(//„  +//,P  +  HzP 2  4  •  •  •) 

Considering  this  equation  modulo  p3 

(H2G»  +  GJk)  =  (f(Z)  -  (Gb  4  G.  p)(//„  +  //,  p))/pi  (mod  p) 

The  right  hand  side  of  this  equation  is  again  a  polyonomial  and  the  left  hand  side  is  essentially 
the  same  as  (8).  So  using  the  Ai  and  B,  obtained  in  the  previous  step  we  can  compute  G2  and 

H2- 

As  an  illustration  consider  the  polynomial  which  we  factored  in  the  last  section 

f{Z)  =  Zfi  4  67Z'5  +  41 Z3  +  2781 Z*  +  737 Z2  +  943Z  4  253 
As  before  we  know  that  its  factorization  modulo  5  is 
=  (Z3  +  2Z2  -  2 )(Z3  +  Z  +  1). 

So  we  now  want  to  solve 

H,G o  +  GJi u)  =  f(Z)  -  (Z>  +  2Z2  -  2)(Z3  +  Z  +  1) 

=  -10Z5  -  10Z 1  4  5Z3  4  10Z2  -  5Z 

modulo  52. 

Since  G$  —  Z3  4  2Z2  —  2  and  Hq  =  Z3  4  Z  4  1  the  diophantine 
solve  are 

AiG„  4  BiH0  =  Ai[Z 3  4  2Z2  -  2)  4  Bt(Z 3  4241)  =  Z'  (10) 

where  we  require  degB,  <  deg  Co  =  degC  and  deg  A,  <  deg  Ho  =  deg  H.  This  condition  on 
the  degrees  of  /I,  and  B,  we  call  the  degree  constraint.  As  we  shall  sec,  there  is  only  one  solution 
to  (10)  which  also  satisfies  the  degree  constraint. 

4)  and  Bo  can  be  determined  by  the  usual  application  of  the  Euclidean  algorithm  to  linear 
diophantine  equations  [Knu69].  Wc  begin  by  computing  the  continued  fraction  of  IIo/Go.  Then 
the  convcrgents  of  the  continued  fraction  arc  computed;  the  next  to  last  one  being  the  one 
which  we  want.  Following  the  usual  schema  for  continued  fraction  computations,  we  have: 
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equations  we  need  to 


i 

2Z 

2Z 

— 2Z  4  1 

ma 

Kg 

o 

2Z  4  1 

-. Z 2  4  2Z  4  1 

2Z3  4  2Z  4  2 

i 

H 

n 

2Z 

-Z2  4 1 

2Z3  -  Z2  4  l 

The  first  line  consists  of  the  “partial  quotients”  of  the  continued  fraction  (the  quotients  in 
the  PH.S)  and  the  next  two  lines  are  the  convcrgents  of  the  continued  fraction.  So  we  have 
An  =  2 Z2  4  Z  —  2  and  /Jo  =  — 2Z2  -(-  2Z  after  removing  a  unit  and  multiplying  fifo  by  —1. 
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The  other  A,  and  B,  can  be  computed  from  Ao  and  R>  by  a  simple  device.  Notice  that  while 
Z'Au  and  Z'B i  probably  do  not  satisfy  the  degree  constraints  they  do  satisfy  equation  (10). 
Pick  Qand/t,  such  that  ZlAo  =  HaQ-\~  A,  and  dcg/1,  <  deg  Ho .  Now  define  B,  =  Z'Bi-\-QGo- 
A,  and  B,  will  also  satisfy  (10)  and  now  /t,  also  satisfies  the  degree  constraint.  It  is  not  hard  to 
see  that  B,  also  satisfies  the  degree  constraint.  Let  A,  and  Bt  be  solutions  of  (10)  which  are  of 
the  appropriate  degrees.  Then 

(A,  -  A,)G0  +  (B,  -  Bi)H0  =  0  (mod  5) 

Remember  that  Go  and  Ho  are  relatively  prime.  The  degree  of  A,  and  4,  is  less  than  deg  Hoi 
thus  the  degree  of  B,  —B,  must  be  less  than  deg  Go.  Since  B,  's  degree  is  small,  so  is  B,.  Since 
Gi  and  Ho  relatively  prime  A,  —  At  and  B,  —  13,  Thus  not  only  do  A,  and  B,  have  the  correct 
degrees  they  form  the  unique  solution  of  (10)  for  a  given  t. 

Repeating  this  procedure  we  easily  get  the  following  table  for  the  basis  of  the  solutions  of 
linear  diophantine  equations  of  the  form  /1G,»  +  FHo  =  C. 


A0  =  2Z2  +  Z  -  2 

Bo  =  -2  Z2  +  2 

A{  =Z2  +Z  -2 

B,  =  -Z2  +  22+1 

/i2  =  Z2  +  2Z  -  1 

B2  =  —Z2  +  2-2 

=  2Z2  -  2Z  -  1 

Bj  =  -2 Z2  -  22  - 

Ai  =  -2  Z2  +  22-2 

S1  =  2  Z2  -22  +  1 

A5  =  2Z2  +  2 

=  - Z 2  +2-1 

Dividing  equation  (9)  by  p  gives 

//,G0  +  G\H{)  =  —2 25  -  2 Z'1  +  2*  +  2Z2  -  Z  +  1  (mod  5) 

To  compute  //,  we  use  the  linear  basis  for  the  solutions  which  were  just  determined. 

H{  =  -2 A5  +  -2/1.,  +  A}  +  1Ai  -  /t,  +  /t0 
=  -2  Z  +  2 

One  factor  of  f[Z),  computed  to  second  order,  is  Ho  +  5//i  =  Z3  —  9Z  -f*  11  as  we  determined 
earlier  using  our  formulation  in  the  previous  section. 

Thus  far,  the  liftings  considered  have  been  at  a  principal  ideal.  This  restriction  can  be  lifted 
without  much  difficulty ,  but  then  this  old  formulation  becomes  more  complex.  The  complications 
involved  also  obscure  some  of  the  relationships  with  our  formulation.  These  modifications  are 
contained  in  a  paper  of  Yun  (Yun76|  and  the  references  contained  therein.  Since  we  feel  that  our 
formulation  is  easier  to  understand,  more  powerful  and  easier  to  implement  than  this  version 
of  Kennel's  lemma  we  will  not  discuss  these  improvements. 

The  Zassenhaus  version  of  llcnsel’s  lemma,  though  somewhat  more  complex,  than  our 
version  still  produces  the  same  answer — the  correct  factorization  though  each  of  the  lifting 
stages.  In  the  following  paragraphs  we  will  demonstrate  that  this  is  not  a  fortuitous  accident 
but  is  due  to  the  fact  that  both  formulations  arc  performing  the  same  computation  but  in  a 
slightly  different  manner. 

The  key  to  using  Ilenscl’s  lemma  to  lift  a  factorization  F  —  G,,//,)  (mod  m)  to  the  m-adic 
completion  is  solving  the  diophantine  equations 

A, Go  -f-  B,H„  =  Z'  (mod  m)  (11) 
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Once  wc  have  obtained  the  A,  and  Bu  it  is  merely  a  matter  of  computing  the  error  introduced 
when  a  factorization  (mod  m*  +  l).  This  error  is  then  used  to  produce  the  appropriate  linear 
combination  of  the  A,  and  B,  which  is  G^ i  and  Hk+ \- 

Our  formulation  is  quite  similar.  We  begin  by  inverting  the  Jacobian  matrix.  Through  the 
Newton-Raphson  formula  we  combine  the  error  terms  to  compute  Gt,  + 1  and  ffk+i-  Structurally, 
these  two  algorithms  arc  quite  similar.  Actually  they  arc  identical.  Wc  shall  see  that  the  solutions 
of  (11)  form  the  inverse  of  the  Jacobian  matrix. 

To  see  this  let, 

G0=Zr  +  9]Zr-'  +  9iZr~2  +  -+9r 

//0  =  Z"  +/i1Za-'  +  hZ’-2  +-+h, 

so  the  Jacobian  matrix  is 

1  0  0  0  1  0  0  0 

hi  1  0  0  g  |  1  0  0 

h2  hi  1  •  0  9i  g{  1  0 

0  -  0  ha  h„_  i  0  0  gr  gr—\ 

0  0  0  ha  0  0  0  gT 

Consider  what  happens  when  this  matrix  is  multiplied  by  a  column  vector 

+  **) 

aoffi  +  “i  +  h  +  <>ohi 

Or—  \9r—  1  4"  ar—l9r  “I-  &«— 2^»  4"  —  l^1*  — l 
Or—l9r  +  &»  — 1^* 

The  (r  4-  «  —  1)  —  i  row  of  this  column  vector  is  clearly  the  coefficient  of  Z'  in  AGq  -|-  Bf Iq 

u/hpfp 

A  =ooZ«-'  +a,Zs-2  +  --  +  a,-u 
B  =  bf)ZT~ 1  +  6,Z'-2 +  •••  +  *_,. 

Thus  the  computation  of  the  inverse  of  the  Jacobian  of  the  system  of  equations  is  a  clever  way 
of  obtaining  all  the  solution  of  (11).  Or,  as  we’d  prefer  to  view  it,  solving  (11)  is  a  clever  way 
of  inverting  a  very  special  type  of  Jacobian  matrix. 


Chapter  VII 


The  Sparse  Hensel  Algorithm 


The  last  chapter  presented  our  formulation  of  llenscl’s  lemma  as  well  another  formulation 
popularized  by  Zasscnhaus.  The  univariate  polynomial  factorization  example  that  was  used  to 
illustrate  the  algorithms  did  not  point  out  their  major  failing  since  this  inadequacy  only  appears 
in  multivariate  problems.  The  purpose  of  this  chapter  is  to  demonstrate  that  the  same  techniques 
that  were  used  in  our  version  of  the  modular  algorithm  can  be  applied  to  our  formulation  of 
the  llcnsel.  Significantly,  we  are  led  to  a  class  of  probabilistic  p-adic  lifting  algorithms  that  run 
in  polynomial  time. 

To  illustrate  the  difficulties  inherent  in  multivariate  problems  we  will  again  consider  the 
polynomial  factorization  problem.  Recall  that  the  key  idea  is  to  convert  a  problem  of  factoring  a 
polynomial  in  the  polynomial  ring  7J[X,  X2, . . . ,  X„]  to  one  of  finding  a  solution  (in  7i[X2, . . . ,  X,,]) 
of  a  system  of  equations.  To  apply  Newton’s  method  we  attempt  to  find  solutions  which  lie  in  a 
completion  of  Il\X 2,  •  • ,  X,,].  Usually,  the  particular  completion  considered  is  m  =  (X2, . . . ,  Xv). 

Consider  the  problem  of  factoring  the  square  free  polynomial 

F(X,  X2 ,  X;,)  -X<  +  (X,X;1  4-  X2Xt  +  X;]  +  2)X3 

4-  (X^X*  4-  2X2X;|  -  3X^X1  +  X\Xi  4-  3Xij)X2 

4-  {— 3X®X3  4-  X2X^  -  3X*Xi  4-  X.JXi  -  X2Xj  4-  3X2  -  2)X 

4-  (  3X2X1 4-  3X2X3  4-  X\  -  1). 

By  picking  a  few  values  for  X2  and  Xi,  we  can  quickly  convince  ourselves  that  unless  F  is 
irreducible,  it  factors  into  2  quadratic  polynomials. 

F(X,  0, 0)  =  (X  —  1)(X  4-  l)3 
F(X,~  1, 1)  =  (X2  -  2X  -  2)(X2  4-  X  4-  4) 

F(X,  3,  5)  =  (X'2  4-  17X  -  3644)(X2  4-  1902X  4-  26) 

F(X,  -5,  7)  =  (X2  —  I2130X  —  I26)(X2  -  33X  4-  65626) 

Notice  that  even  though  F(X,  0, 0)  .splits  into  lincars,  there  is  only  one  way  to  combine  the 
line  irs  into  a  factorization  into  two  <|iiadratics.  Letting  the  factorization  be  F(X)  =  (X2  + 
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AX  -j -  B)(X2  -f-  CX  -f-  D)  we  get  the  following  system  of  equations 


A  +  C  -  (X2X<  +  X2X3  +  X\  +  2) 

AC  +  B  +  D  =  (X\X\  +  2X2X'l  -  3X5*3  +  X\X3  +  3X5) 

AD  4-  BC  —  (-3*5*5  _j-  *2*3»  _  3*5*3  -f  *5*3  -  *2*3  -f  3*5  —  2) 

BD  =  (-3*5*3  +  3*5*:j  +  *5  -  1) 

We  are  looking  for  elements  /l ,  B,  C  and  D  of  /?[*2,  *3]  that  satisfy  this  system  of  equations. 

Using  Newton’s  method,  the  solution  to  (1)  will  be  generated  as  a  multivariate  power  scries 
in  *2  and  *(.  This  solution  is  developed  one  term  at  a  time,  first  the  constant  term,  then  the 
linear  terms,  and  so  on.  To  get  the  initial  approximation,  we  would  like  to  set  *2  and  *3  to 
zero  and  solve  the  resulting  simpler  system.  This  system  would  usually  be  solved  by  factoring 
F(X,  0, 0).  In  this  case  (I)  reduces  to  the  simple  system 

A  +  C  =  2, 

AC  +  B  -f  D  =  0 
AD  +  BC=  -2 
BD  =  —I 

F rom  the  univariate  factorization  we  see  that  this  system  has  two  solutions,  cither  (/4,  B,  C,  D) 
(0,  — 1,  2, 1)  or  (2,  1,0,  —1).  This  corresponds  to  the  commutativity  of  the  factors.  We  will  choose 
the  A  —  0  solution.  The  other  solution  will  give  a  similar  result,  but  with  A  and  C,  and  5  and  D  in¬ 
terchanged.  Newton’s  iteration  for  systems  of  equations  takes  the  form 

ifc  =  ifc-i  —  ■/-'(*[>)  •  l[ik- 1), 


where  J  is  the  Jacobian  matrix  of  the  system.  Let  the  column  vector 


denote  the  initial  approximation,  and  the  column  vector 


f(A,B,  C,  D) 


(  A  -\-C  —  f\ 
/IC+5  +  D-/2 
AD  +  BC-h 

\  BD-U 


represent  the  system  of  equations  which  is  to  be  solved.  We  have  used  fc  as  an  abbreviation 
for  the  coefficient  of  *'"'  in  F(X,X2,X 3). 

For  the  system  of  equations  in  (1)  the  Jacobian  matrix  is 

(101  0\ 

C  l  A  l\ 

D  C  B  A  I 
0  D  0  B/ 
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When  evaluated  at  X2  =  0  and  X3  =  0,  the  Jacobian  is 

/I  0  1  0  \ 

2  10  1 
12-1  0  | 

VO  1  0  -1/ 

The  next  step  is  to  invert  this  matrix.  Unfortunately,  it  is  singular.  Thus  we  will  not  be  able 
to  lift  the  factorization  at  the  ideal  (X2  —  0,  Xj  —  0)  and  must  try  some  other  ideal. 

If  we  try  (X2  -f  1,  X3  —  1)  then  we  get 

F(X,  -1, 1)  =  X4  -  X3  -  10X  -  8 

=  (X2  -  2X  -  2)(X2  +  X  -M)' 

The  corresponding  Jacobian  matrix  is 

0  1  0  \ 

1  -2  l  1 

1  -2  -2  f 

4  0  -2/ 

which  is  invertible.  The  problem  with  using  this  ideal  is  the  factorization  will  be  developed  in 
a  multivariate  power  scries  in  X2  +  1  and  X3  —  1.  Since  the  actual  factorization  is 

F(X,  X2,  X:1)  =(X2  -f  (X2X3  +  2)X  -  3X^X3  -f  1) 

(X2  +  (X2X«  +  X3)X  +  X23-l), 

these  factors  will  have  16  and  17  terms  respectively,  when  written  in  terms  of  X2  -\-  1  and 
X3  —  1.  If  there  are  more  variables,  the  number  of  terms  produced  will  increase  exponentially. 

The  solution  for  the  Dense)  algorithm,  just  as  for  the  modular  algorithm,  is  to  perform  the 

lifting  one  variable  at  a  time.  We  replace  the  X,  by  T,  -f  a,  After  performing  the  lifting  for 

Yi,  the  answer  will  be  dense.  But  after  converting  it  back  to  a  polynomial  in  X,  the  answer 
can  be  no  more  dense  than  the  final  answer.  Section  I  presents  the  details  of  the  sparse  Dense) 
algorithm  using  the  formalism  developed  in  the  last  two  chapters.  Section  2  points  out  a  number 
of  “tricks”  which  can  be  used  to  great  advantage  in  an  implementation  of  the  sparse  Hensel 
algorithm. 


1.  Introduction  of  Sparse  Techniques 

Following  this  procedure,  we  will  solve  (1)  for  a  power  series  in  X2.  Since  X3  will  be  1 
throughout  this  portion  of  the  computation,  we  can  simplify  (1)  to 


A  +  C  =  {2X2  +  3) 


AC  +  D  +  D  =  (—3X5  +  *2  +  3*z  +  X]  +  2X2) 
AD  -1-  DC  =  (-3x5  -  3x5  +  X\  +  iX\  -  2) 
m  -  (-3X5  +  3X3  +  X:]  -  1) 
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Since  we  will  be  working  with  power  series  in  X2  -f-  1  it  will  simplify  some  things  to  write  these 
equations  in  terms  of  Y2  =  X2  -f-  1 


(2) 


A  +  C  =  Y\  -  3Y]  -{-  5Y2  -  1 
AC  +  B  +  D  =  -3 Y\  -f  16  Y\  -  31  Y\  +  28^  -  10y2 
AD  +  BC  =  — 3^2  +  24KJ  -  87 K ^  +  186  y^  -  254^ 

4-  227 Y]  -  132^  -f  47y2  -  10 
BD  =  -3y*  -f  24 Y\  -  84y'j  +  171  Y\  -  225 Y\ 

+  i99y^  -  ii7y^  +  42y2  -8 

To  simplify  the  computations,  we  will  again  work  with  a  rather  small  modulus,  this  time 
163.  The  inverse  of  the  Jacobian  is  then 


-J-'  = 


72 

18 

18 

-9 

36 

36 

-18 

36 

-73 

-18 

-18 

9 

72 

72 

-36 

—9 

Subtracting  the  left  hand  side  of  (2)  from  the  right,  and  using  the  initial  approximations  A  = 
—  2,  B  =  —2,  C  —  1  and  D  ~  4,  we  sec  the  first  term  of  the  error  is 


-sy2 
ioy2 

-47  y2 
-42  y2 

Notice  that  in  selecting  the  error  terms  we  ignored  all  but  the  lowest  order  ones  since  only  they 
will  affect  the  correction.  When  multiplied  by  — J~~ 1 ,  we  have  the  first  correction  term 


/  72 

18 

18 

“9\ 

/  — sy2  \ 

f  iY2  \ 

36 

36 

-18 

36  1 

ioy2 

m 

-73 

-18 

-18 

9  I 

-47y2 

Y2 

V  72 

72 

-36 

-9/ 

V-42  y2/ 

V-15  Y2) 

To  compute  the  next  correction  term,  we  replace  A,  B,  C  and  D  by  —2  -f-  4y2,  — 2  ■+•  3 y2, 
1  -f  y2  and  4 —  15y2  respectively  in  (2).  The  only  nonzero  terms  should  be  of  degree  2  or  greater 
in  y2.  Continuing  this  iteration  we  discover  that 

A  =  -2  +  4T2  -  3T|  +  Y\ 

B  =  -2  +  3y2  -  3Y\  +  Y\ 

c  =  i  +  y2 

d  =  4-\5Y2  +  3oy^  -  3oy^  -f  i5y^  -  3y^ 

Reexpressing  these  solutions  in  terms  of  X2  we  have 

A  =  X]  +  X2 
B  =  X2  —  1 
C  =  X2  +  2 
D  =  -2X\  +  1 
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So  our  factorization  thus  far  is 

F(X,  X2,  1)  =  {X2  +  (Xi  4  X2)X  +  X:}  -  l)((X2  4  (X2  4  2)X  -  3X®  4  l) 

Now  we  need  to  construct  a  system  of  equations  for  the  coefficients  of  X2  so  A,  B,  C  and  D 
can  be  lifted  to  polynomials  in  Xj.  We  now  know  that  A(X3  —  1)  =  Xj  4  X2-  We  now  make 
our  standard  probabilistic  assumption,  that  A  =  aX2  4  0X2  where  o  and  0  are  polynomials 
in  Xj.  Continuing  we  have 

A  =  aX\  +  0X2 
B  ~  ',X\  4  6 
c  =  tX2  +  t 

D  =  r)X\  +  0 

Consider  the  first  equation  of  (1)  after  having  made  these  replacements. 


A  +  C  =  aX]  +  0X2  +  tX2  4  f  =  (X2Xj  4  X2X3  +X*  +  2) 


Since  the  Greek  letter  variables  do  not  involve  X2  we  arc  free  to  equate  the  coefficients  of  the 
powers  of  X2.  This  gives  the  following  equations 

a  =  1 

p  +  t  =  X$  +  X3 

f  —  2. 

Doing  this  with  the  other  equations,  we  have 


i?  =  -3X3 

at  =  X3 

of  4  7  =  3 

inn 

* 

II 

ar\  =  —  3Xj 

r,0  =  — 3Xj 

7«  =  X) 

<*0  4  7C  =  3 

09  4  St  =  Xj  -  Xj 

<5f  =  -2 

r?  =  —  3Xi 

70=  1 

60  =  -L 

a  =  1 

0  4  e  =  X  j  4  x3 

f  =  2 

These  equations  can  be  solved  by  the  same  technique  that  we  used  before,  or,  as  we  point 
out  in  section  3,  they  can  be  solved  by  inspection.  In  the  rest  of  this  section  we  will  try  to  make 
more  precise  the  technique  that  was  just  used  in  factoring  F. 

As  this  example  shows,  the  Henscl  algorithm  is  somewhat  more  complex  than  the  modular 
algorithm.  Unfortunately,  this  algorithm  also  has  a  reputation  among  system  implementors  as 
being  one  of  the  more  difficult  algorithms  to  implement.  Though  this  may  be  true  of  the  old 
formalism,  our  new  formalism  is  quite  simple  and  to  a  very  large  degree  reduces  the  duplication 
of  code  which  seems  to  have  been  necessary  in  the  older  implementations  we  have  examined. 
The  program  that  solves  systems  of  algebraic,  equations  using  p-adic  techniques  is  just  a  page 
or  two  of  code.  Interfacing  this  code  to  any  particular  problem,  such  as  computing  GCD’s, 
factoring  polynomials  and  computing  square  free  decompositions  is  merely  a  matter  of  setting 
up  the  appropriate  system  of  equations. 

The  first  algorithm  we  present  is  the  univariate  version  of  Newton’s  iteration  for  systems 
of  equations.  It  is  fairly  straightforward  and  should  present  no  problems. 
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Algorithm  U  implements  the  univariate  version  of  the  sparse  Hcnsel  algorithm.  It  assumes 
that  pi, are  unknowns,  X  is  a  variable  and  =  /  is  a  system  of  polynomials 

in  X,gi, ...  ,gn,  m  >  n  It  is  also  given  a  point  X  =  a  and  an  initial  solution  gt, . . . ,  gn  such 
that  /, (j| , . . . ,  p„)  =  0  (mod  X  —  a).  Finally  it  is  given  an  integer  k.  This  algorithm  returns 
polynomials  G^  . .  .,G„  such  that  /,(G|, . . . ,  G„)  =  0  (mod  (X  —  a)fc+‘). 

Ul.  [Jacobian]  Compute  the  Jacobian  of  the  system  /  at  the  point  X  =  a. 

U2.  [Make  it  square]  Select  n  linearly  independent  rows  of  the  Jacobian  and  call  the  resulting 
matrix  J.  Renumber  the  so  /,  corresponds  to  the  ith  row  of  J. 

U3.  [Invert]  Invert  the  Jacobian  to  produce  J~ 

U  l.  [Shift  system  to  origin]  Replace  X  by  Y  -j-  a  in  the  /,.  (Now  /,(p,, . . . ,  p„)  =  0  (mod  Y).) 
Set  G|  ♦—  p| ,  • . . ,  G„  «—  gn. 

U5.  [Iterate  over  number  of  desired  terms]  For  j  —  I  through  k  do  steps  U6  and  V7. 

U6.  (Iterate  over  number  of  equations]  For  i  =  1  through  k  set  Ei  to  the  F'-term  of 
/  {G, , . . .  ,G„). 

U 7.  [Newton’s  iteration!]  For  i  =  1  through  n  set  G  <—  G  E. 

U8.  [Shift  back  to  old  variables]  Replace  Y  by  X  —  a  in  Gi, . . .  ,Gn. 

U9.  [Done]  Return  G| , . . . ,  G„. 

The  multivariate  portion  of  this  algorithm  is  the  part  that  takes  advantage  of  the  sparsity 
in  the  system.  It  uses  algorithm  U  to  compute  the  the  solution  in  one  variable  of  the  system 
of  equations.  It  then  uses  this  solution  to  create  a  new  system  of  equations  where  the  new 
unknowns  are  the  coellicicnts  of  the  polynomials  which  were  just  determined  by  algorithm 
U.  The  variables  S|,...,S„  are  the  solutions  to  the  original  system  of  equations,  as  they  are 
improved  with  ttie  introduction  of  each  new  variable. 

Algorithm  M  does  a  multivariate  Ilcnsel  lifting  It  takes  as  input  the  unknowns  g\,...,gn, 
variables  X\, . .  .,XV  and  a  system  of  polynomials  f\, . . .  ,/„,  in  the  g,  and  Xj,  m  >  n.  It  is  also 
given  a  list  of  evaluation  points  ai,.  ..,a,,  and  initial  values  for  the  variables  such 

that  fi(gt,...,gn)  =  0  (mod  (X|  —  oi , .  , Xv  —  a„)). 

Ml.  [Initialin,  Initialize  the  variables  Sj , . . . ,  Sn  to  the  symbols  pi,...,p„  respectively. 

M2.  [Loop  through  variables]  For  t  =  1  through  v  do  M3  up  to  M6. 

M3.  [Remove  most  of  the  variables]  Replace  X^i, . . . ,  Xv  in  the  /  by  0,4.1, . . . ,  av  to  get 
polynomials  F| , . . .  Fm 

ML  [Solve  univariate  problem]  Use  algorithm  U  on  g,,  Xi,fj,a,  and  gi  to  get  solutions  to 
Gi. 

M5.  (Update  the  original  solution]  Replace  the  symbols  <?,  in  S|, . . .  ,Sn  by  the  values  of 
the  Gj. 

M6.  [New  equations]  For  each  /,  replace  the  <7,  by  G,.  The  result  is  a  polynomial  in  Jf,. 
The  coefficients  arc  the  new  /4. 

M7.  [Done]  Return  the  S,  as  the  solution. 
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2.  Some  Ideas  for  Implementations 

In  this  section  we  will  consider  a  couple  of  useful  heuristics  that  can  often  dramatically 
speed  up  the  running  time  of  the  sparse  Ilcnscl  algorithm.  However  there  are  problems  for 
which  these  heuristics  do  not  help,  so  they  do  not  affect  the  analysis  of  the  running  time  of  the 
algorithm.  These  heuristics  were  first  used  by  Paul  Wang  [Wan78]. 

Recall  the  system  of  equations  (2)  that  was  encountered  in  the  example  at  the  beginning 
of  this  chapter.  Ignoring  several  of  the  equations,  and  reordering  them  somewhat  we  have 

V  =  —3X3  or?  =  —3X,  yr/  —  —3X3 

rip  =  —3X\  ae  =  X3  7O  =  l 

60-=—  1 

Reading  these  equations  horizontally,  the  values  for  the  other  variables  just  drop  out. 

In  general  this  heuristic  proceeds  as  follows:  (1)  Isolate  all  linear  equations  with  one  unknown. 
(2)  In  the  system  of  equations  to  be  solved,  replace  all  occurrences  of  the  unknowns  just 
determined  by  their  values.  (3)  Repeat  until  there  are  no  linear  equations  with  one  unknown. 

This  procedure  is  very  cheap;  the  only  point  that  could  be  expensive  is  the  substitutions  in 
step  (2).  Since  this  procedure  can  be  performed  before  the  variable  being  lifted  is  shifted  away 
from  zero,  step  (2)  can  be  no  more  expensive  than  verifying  the  sparse  form  of  the  answer. 

In  section  3  we  will  analyze  the  sparse  Hensel  algorithm.  Here  we  only  discuss  the  probability 
that  this  heuristic  will  actually  be  beneficial.  It  is  rather  difficult  to  determine  precisely  what 
the  probability  that  this  procedure  will  be  successful  is,  but  we  can  make  some  rough  estimates 
in  the  factorization  problem  with  the  following  observation.  Assume  that  we  wish  to  factor  F, 
a  polynomial  in  v  variables.  Assume  F  has  2  monic  irreducible  factors 


F  =  [M0  -f  m,  H - 1-  mr)(M|  -f  mr+1  H - 1 -  mt) 


where  Mi  are  both  monic  monomials  and  m,  are  monomials  whose  coefficients  are  to  be 
determined.  If  the  exponent  vectors  of  all  of  the  m,  are  distinct  we  will  be  able  to  determine 
their  coefficients  by  the  procedure  just  outlined. 

Assume  that  the  degree  of  each  variable  in  the  monomials,  m,,  is  bounded  by  D.  Then 
there  are  N  =  (D  -f-  1)'  possible  exponent  vectors.  There  are  Nl  ways  t  monomials  could  be 
chosen  (we  are  ignoring  the  fact  that  some  arc  permutations  of  the  others)  and  N\/(N  —  f)! 
ways  t  different  monomials  could  be  chosen  (again  disregarding  the  permutations).  Therefore 
the  the  probability  that  the  system  of  equations  is  “presolved”  is  P(t,v,D) 


P(t,v,D)> 


N\ 

Nl 


N\ 


{ N  -  t)\Nl 

(*)‘ 

V/v-i 


t  (Uf±)Nl 

by  Stirling’s  approximation.  Simplifying  this  somewhat 


(4) 


Notice  that  as  N  —  t  goes  to  infinity,  holding  t  fixed,  the  quantity  in  square  brackets  tends 
towards  ef.  So  for  large  N,  P{t,  v,  D)  is  quite  close  to  I.  Computing  the  Taylor  series  expansion 


of  (4)  at  N  —  oo  we  find 


1 12  —  f 

P(t,v,D)< 

So  the  chances  that  this  technique  will  not  be  useful  is  somewhat  greater  than 

1  t2-t 

2  (D+  1),'‘ 

As  the  number  of  variables  increases  this  becomes  very  small.  In  fact  for 

log  t 2  —  t  logt 

v  >  log (D  +  1)  ~  log(D  +  1) 

we  would  expect  the  equations  to  always  be  prcsolved. 

Wang  uses  a  technique  that  is  essentially  equivalent  to  this  in  his  implementation  of  the  EEZ 
algorithm  To  illustrate  how  dramatically  it  can  afTect  the  running  time  of  the  Henscl  algorithm 
have  timed  four  different  algorithms  on  the  first  set  of  examples  given  in  the  appendix.  These 
examples  were  run  on  the  MACSYMA  [MAC77]  algebraic  manipulation  system  at  Massachusetts 
Institute  of  Technology  which  runs  on  a  Digital  Equipment  Corporation  KL-10  processor.  The 
first  column  is  the  EZ  CCD  algorithm  given  by  Moses  and  Yun  [Mos73]  The  second  column, 
labeled  EEZ2  gives  timings  using  Wang’s  new  EEZ  GCD  algorithm  |Wan79],  but  without  using 
the  heuristic  described  in  this  section.  The  second  column,  labeled  EEZ,  is  Wang’s  algorithm 
using  the  heuristic.  The  final  column  is  the  sparse  modular  algorithm  described  in  this  thesis. 
All  the  times  given  arc  in  seconds. 

The  appendix  lists  three  sets  of  polynomials,  /,,  &  and  d,.  The  four  algorithms  were  used 
to  compute  the  GCD  of  ftd,  and  g,di  after  these  two  products  were  multiplied  out. 


V 

EZ 

EEZ2 

EEZ 

Sparse  Mod 

1 

.036 

.058 

.040 

2 

.277 

1 

.416 

3 

.431 

.785 

.537 

MM 

1.288 

1.224 

.704 

.842 

5 

3.128 

7.331 

1.410 

1.825 

MM 

* 

7.428 

1.966 

3.364 

a 

♦ 

10.282 

1.628 

4.190 

8 

* 

* 

2.446 

9 

* 

* 

2.346 

10 

* 

* 

2.832 

8.202 

The  GCD  computed  here  had  4  terms  and  the  degree  of  each  variable  was  3  or  less.  The 
cofactors  had  the  same  parameters.  Considered  as  a  factoring  problem,  we  would  have  t  =  10, 
D  =  3.  So  we  would  expect  the  timings  to  level  off  at  about 

„  —  2  !°g  l-  _  2l°g10  __  o  oo 

log(/7 -j- i)  log  4 

In  fact  the  timings  level  o(T  between  5  and  6.  That’s  not  too  far  off. 

In  the  final  section  we  give  some  timings  for  problems  for  which  all  algorithms  but  the  ones 
described  in  this  thesis  fail. 
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3.  Analysis  of  the  Sparse  Hensel  Algorithm 


We  will  assume  we  are  using  the  sparse  Hensel  algorithm  to  lift  a  solution  io  =  x •  •z|)n, 
of  the  system  of  equations  /,  [z{ )  =  0  modulo  m  =  (Xi  —  a },...  ,XP  —  o,,)  to  a  solution  id 
modulo  m'/+l.  We  will  let  t  denote  the  maximum  of  the  number  of  terms  in  any  x for  t  =  1 
through  n.  In  keeping  with  our  concern  for  sparse  polynomials  we  assume  that  t  <?C  (d  +  1)“* 
The  solutions  modulo  m'^1  are  determined  by  lifting  the  solutions  we  have  modulo  in 
to  solutions  modulo  mi  =  ( X\  — ai)rf"*‘lm,  which  are  then  lifted  to  solutions  modulo  m2  = 
(X2  —  a2)‘/+1m(  and  so  on.  Since  we  pass  from  a  solution  modulo  m,  to  one  modulo  m,  ,  (  we 
will  introduce  a  new  system  of  equations  and  unknowns,  f, (z, ) .  The  number  of  unknowns  will 
never  exceed  n  t  although  the  number  of  equations  may.  Though  these  additional  equations 
may  be  used  to  significant  advantage  in  practice,  as  was  pointed  out  in  section  VIII. 2,  we  will 
ignore  them  in  the  following  analysis. 

Following  the  same  principles  used  in  the  analysis  of  the  modular  algorithm,  we  assume  that 
the  number  of  terms  in  the  solution  of  /,(!,)  =  0  modulo  m,  is  t.  We  will  also  only  count  integer 
arithmetic  operations  now  and  multiply  by  the  appropriate  factor  at  the  end  to  account  for  e. 
/,(!,■)  will  involve  no  more  than  ni  unknowns  z^\  . . .  This  is  important  as  it  gives  the  size 
of  the  Jacobian  which  needs  to  be  inverted  at  each  stage.  As  we  arc  only  considering  classical 
algorithms,  we  can  assume  that  inverting  the  Jacobian  will  require  about  (»  t)3  operations. 
Performing  an  update  of  the  unknowns  involves  multiplying  a  matrix  by  a  vector,  this  requires 
about  2 (n  t )2  operations,  but  it  must  be  performed  d  times.  We  get  the  following  formula  for 
the  time  required,  excluding  the  computation  of  the  error  terms. 

Oy)((n  <)3  -f  2 d(n  ■  f)2)^ 


which  is  dominated  by  v(nt)3. 

The  error  terms  must  be  computed  d  times  for  each  variable  or  dv  times  in  all.  The 
computation  of  an  error  term  is  merely  a  special  case  of  verifying  the  answer.  Without  knowing 
the  particular  problem  under  consideration  we  can’t  say  much  more.  Denote  by  T  the  time 
required  to  verify  the  answer.  We  can  use  dvT  as  an  upper  bound  on  the  amount  of  time 
required  to  compute  the  error  terms. 

Thus  the  total  number  of  integer  operations  required  is 

dvT  -f-  u(n  •  t)3. 

The  total  size  of  the  answer  t,0 ,  is  n  t,  so  we  can  write  this  as  dvT  -f-  vt30l.  Including  the  factor 
due  c  computed  in  section  4.2  we  have 

0^(dvT  +  vt3r)l)j  log \dv2U—1). 


4.  Timings 

In  this  section  we  will  present  the  timings  for  a  two  of  problems  using  the  Sparse  Modular, 
Hrowu’s  and  ( -ollins’  Modular,  the  !”/<(!(  ’I)  algorithms  and  Wang’s  new  l',KX  (J(!l)  algorithm. 

The  first  example  was  chosen  to  show  the  sparse  modular  (!( ’I)  algorithm  at  its  best.  The 
polynomials  for  this  test  are  the  second  set  of  polynomials  listed  in  the  appendix.  Again,  there 
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are  three  polynomials  listed  for  each  row,  /,,  g,  and  dt.  This  time  notice  that  the  structure  of 
each  of  the  polynomials  is  identical,  they  only  differ  by  coefficients  The  heuristic  mentioned 
in  section  2  docs  not  help  in  this  case.  The  following  table  gives  the  computation  times,  in 
seconds,  for  the  EZGCD  algorithm,  the  Modular  algorithm,  the  Reduced  algorithm,  Wang’s 
new  EEZ  algorithm  and  finally  the  Sparse  Modular  algorithm. 


0 

EZ 

Modular 

Sparse  Mod 

2 

mm 

.481 

Pfet 

.108 

.312 

3 

6.092 

2.908 

1.074 

S£ 

IEI%51 

64.963 

* 

5.906 

BOB 

II 

282.373 

♦ 

9.075 

6 

* 

* 

* 

60.417 

7 

* 

* 

* 

* 

msm 

8 

mm 

mm 

♦ 

* 

4.953 

9 

n 

■g 

♦ 

* 

10 

* 

* 

8.811 

The  asterisks  indicate  that  MACSYMA  ran  out  of  storage.  As  expected  the  modular 
algorithm  ran  in  exponential  time.  Both  the  EZ  and  the  Reduced  algorithms  ran  out  of  storage. 
This  was  to  be  expected.  In  the  case  of  the  reduced  algorithm  the  size  of  the  PRS  finally  caught 
up  with  it.  This  example  was  carefully  designed  so  that  all  the  GCD’s  were  bad  zero  problems 
for  the  EZ  algorithm.  Thus  when  the  polynomials  were  shifted  the  lifting  process  exploded. 
Since  the  heuristic  discussed  in  section  2  was  not  applicable  in  this  case,  the  EEZ  algorithm 
also  exhibited  exponential  behavior. 

The  second  example  is  from  (Mos73).  This  one  was  specially  designed  to  be  optimal  for  the 
EZ  GCD  algorithm.  It  is  about  as  bad  as  possible  for  the  sparse  modular  algorithm.  The  two 
polynomials  whose  ged  is  taken  are 


(£<  +  ') (X>r'+2)  a.  (E 


Since  the  EZ  algorithm  produces  all  the  terms  of  the  answer  of  the  same  total  degree  at  once, 
very  little  computation  is  done  until  the  final  step,  at  which  time  the  entire  answer  is  determined. 
Here  we  only  compare  the  EZ,  the  EEZ  and  the  sparse  modular  algorithms.  The  others  take 
too  long  or  exceed  storage  capacity  very  quickly.  Since  the  sparse  modular  and  EEZ  algorithms 
determine  the  answer  one  variable  at  a  time,  you  would  expect  at  least  linear  growth  as  the 
number  of  variables  increases.  Since  the  degree  and  number  of  terms  also  increases  wc  see 
significantly  more  than  linear  growth  for  the  sparse  modular  algorithm.  It  is  not  at  all  clear 
why  the  EEZ  algorithm  ran  in  time  so  (  lose  to  that  required  by  the  sparse  modular  algorithm, 
especially  since  they  arc  so  radically  different. 


66 


V 

EZ 

EEZ 

Sparse  Mod 

2 

.116 

.705 

.496 

3 

175 

1.554 

m 

236 

3.368 

2.791 

m 

.341 

6.075 

5.037 

6 

.460 

9.803 

8.299 

7 

.602 

mm 

13.112 

8 

.760 

mm 

19.704 

.944 

28.835 

28.709 

10 

1.142 

38.307 

40.511 
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Chapter  VIII 


Conclusions 


The  basic  idea  advanced  in  this  thesis  is  quite  simple.  By  breaking  a  problem  up  into 
layers,  it  is  possible  to  make  use  of  the  structure  of  the  solution  on  one  layer  as  a  guide  to  the 
construction  of  the  solution  on  the  next  layer.  For  the  polynomial  problems  we  have  considered, 
and  for  multivariate  polynomial  problems  in  general,  this  is  rather  easy  because  we  have  a 
natural  layering  with  which  to  work.  The  introduction  of  a  new  variable  constitutes  passing 
to  a  higher  level.  The  evaluation  homomorphism  (replacing  X,  by  o,)  is  a  means  of  going  to 
a  lower  layer.  Throughout  this  thesis  we  have  used  the  observation,  due  to  Paul  Wang,  that 
the  structure  of  polynomials  does  not  change  significantly  for  sparse  polynomials  under  the 
evaluation  homomorphism. 

In  the  first  portion  of  this  thesis  the  familiar  modular  algorithm  was  modified  to  take 
advantage  of  the  possible  sparseness  in  the  answer.  This  was  accomplished  by  assuming  that 
if  the  image  of  a  group  of  monomials  under  the  evaluation  homomorphism  is  zero  then  the 
coefficients  of  these  monomials  are  exactly  zero.  This  modification  has  the  significant  advantage 
that  it  turns  an  algorithm  that  formerly  required  exponential  time  in  the  number  of  variables 
into  one  that  required  only  probabilistic  polynomial  time  in  the  size  of  the  answer. 

In  the  second  section,  we  resurrected  an  old  formulation  of  Hensel’s  lemma,  namely  Newton’s 
iteration,  and  showed  how  it  could  be  adapted  to  utilize  the  sparscncss  of  the  intermediate 
results.  This  version  of  Henscl’s  algorithm  also  only  requires  probabilistic  polynomial  time  in 
the  size  of  the  answer.  In  addition  to  the  increase  in  efficiency,  we  feel  our  formulation  of 
llcnscl’s  lemma  is  much  easier  to  justify,  pcdagogically  clearer  and  more  widely  applicable  than 
other  formulations. 

There  are  a  number  of  problems  left  open  by  this  thesis  that  are  well  worth  investigating 
in  the  future.  First,  it  would  be  very  interesting  to  sec  a  conventional  expected  time  analysis 
of  the  algorithms  presented  here.  Towards  that  end  much  tighter  bounds  than  those  developed 
in  Theorem  I  of  chapter  IV  must  be  computed.  In  particular  good  bounds  for  very  sparse  goal 
polynomials  would  be  exceedingly  interesting. 

Second,  there  arc  still  a  vast  number  of  applications  of  the  two  basic  algorithms  presented 
here  that  we  have  not  yet  investigated.  Particularly  intriguing  are  problems  of  computing 
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resultants  and  partial  fractions  expansions.  Interfacing  the  Hensel  algorithm  to  a  system  for 
solving  systems  of  algebraic  equations  to  obtain  polynomial  or  rational  function  solutions  would 
be  very  useful  for  a  number  of  problems.  This  sort  of  system  could  also  be  profitably  extended 
to  compute  the  solution  of  algebraic  equations  in  a  prespecified  algebraic  field. 

Finally,  the  basic  principal  of  solving  problems  in  layers  and  making  use  of  the  structure 
of  the  answer  at  one  level  to  aid  in  “lifting”  the  solution  to  the  next  higher  level  seems  to 
be  applicable  to  many  other  problems  besides  those  in  algebraic  manipulation.  Algorithms  for 
certain  restricted  sorts  of  graphs  come  to  mind,  as  well  as  applications  in  information  storage 
and  retrieval. 


Appendix  I 


Polynomials  Used  in  the  Timing  Tests 


This  appendix  lists  the  polynomials  that  were  used  to  test  the  various  GCD  algorithms  in 
sections  VII. 2  and  VII. 4.  The  d,  polynomials  are  the  GCDs  which  arc  computed,  the  /,  and  p, 
the  cofactors.  The  polynomials  that  were  fed  to  the  various  GCD  routines  were  d,/t  and  d,p,. 

d\  =  *?  -f-  *1  +  3 

/i  =  2**  -f-  2*i  -f- 1 
0i  —  +  2*i  +  2 

dj  —  2*5*5  +  *i*2  +  2*i 
h  —  *2  -f  2*5*2  +  *J  +  1 

02  =  *5*5  +  *?*2  +  *1*2  +  x\  +  *1 

d  \  =  *?*?  -f-  *3*;j  +  2** *2*3  +  *1*3 
/l  =  *?  +  *|*3  +  *?*2*  3  +  *1*3  +  *1*2 
03  =  *2*3  -f  2*1*3  +  *3  +  *1 

dl  =  *?*?  +  *j*3*1  +  *5*2*4  +  *2*1  +  *5*2*3 

U  ~  *1  *2*3*1  +  *1*3*?  +  *1*5  +  *5  +  *1*3*1 
04  =  *1*3*5  +  *3*4  +  *4  +  *1  *|*3*4  -f  *l*j 

<4  =  *J*|*5*1*5  +  *1*2*5  +  *?*3*?*5  +  *?*2*?* 4*5  +  *5*2*5*? 

/5  =  *1*5*5  +  *1*2 *5*4*5  +  *  I  *2*5*4  +  *1*2*4  +  l 
05  =  *1  *5*4*5  +  *2*5  +  *1  *2*4*5  +  *2*5  +  *1*2*3*? 

do  =  *1*2*?*5*6  +  *l*2*3*1*5*»  +  *5*3*0  +  *5*2*5*4*5*0  +  *5*3*5*# 
fa  =  *5*2*i*5*5  +  *.*3*5*5  +  *1*5*5  +  *5*5*5*5*#  +  *1*5*43% 
go  =  *5*5*1*5*11  +  *i*?*5*o  +  *5*5*1*5*11  +  *1*5*3*?*#  +  *5*3*5 

dj  —  *i*5*?*5*5  +  *5*i*t*5*5  +  *?*?*?  +  *5*2*?*#  +  *j*i*5 

ft  =  *5 *2*?*5*o*7  H-  *1  *2*1*1, *7  +  *  l*?*5*7  +  *1  *?*5*7  -f  *?*2*3*J*5*» 

07  =  *|  *,*5*5*5  T'  *5*5*1*5*11*7  f-  *1*14*7  +  *5*2* l*5*0*7  +  *?*j*.»*| 
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do  —  £2*t£5*6*7 ly  4~  X j “|“  X ^XjX jXqX-j  "j"  X^X'^X^XxX^X 6X7  |  Xj2<®8 

/o  =  x^xfaxfaxlxl  +  x2x$xlx\  +  x?*****^*?**  +  x\xlx&lx}xs  +  xix\x\x\xi 
98  =  Xix'lxsxsxixl  +  xixlx]x\xlxn  +  *?X2*3*4*£* g  +  xfc^xix^xo  +  11X2*4*5 

da  =  x^XjXiXfyXfixl  4-  x,xjX3XiX%x«xo  4-  22*3*12;;  £s  29  +  x\x\x\x\x\nx\  -\-  X2X3nx\x^nx\ 
fg  =  Z,  £2*1*7*829  4-  x\xg  4-  X?*3*4*I*ft*?  4-  *4*5*7  +  *3*4*6*7 

gg  =  zJz2*l*5*to27*K*g  +  zJz2*1*5*b*7*8*9  4~  *?*3*4*0*7*K*9  4"  *|  *2*6*8  4"  *2*4*5*6*7 

d|0  =  ZtZ2*4*8*9*|0  4"  *2*4*5*0*7*9*10  +  *,*2*3*5*7*9  4~  *1*7*427*9  4"  *| *3*1*7 *8 

/l0  =  2| 22*4*4*li*728*9*M)  4~  *2*3*1*0*9*?0  4~  *1*2*3*1*5*0*7*8*9*10 

4-  *^£2*4*5*8*9*10  4-  £3*4*5*6*7*9*10 
9lO  =  *1  *2*3*5*0*7*8*9*10  4-  *1*0*9*10  4”  *1*2*3*1*5*0*8*9*10 

4*  *1  £3*0*7*0*10  4*  *1*5*0*7*9 


The  following  set  of  polynomials  was  used  for  the  examples  of  section  VI. 4.  Notice  that 
the  structure  of  each  of  the  polynomials  in  each  triple  is  the  same. 

fi  =  18*1*1  -}-  2xJ*2  +  18*2  +  10*1  +  18*i 
02  —  16*!*]  -f- 1 1  x?*2  -f-  Ox?  +  4*1  +  12xi 
d.2  =15*1*1  +  Uxfa  +  11X2  +  19*1  +  7*1 
/i  =  16X1*2*3  4~  14*2*3  4"  3*?*3  +  18*1*3  4“  *1*2 
0)  =  19*1*2*1  +  2*1*1  4"  12*J*1  4~  *1*3  4~  10*1*2 
di  =  7*1  *1*1  -f- 13*1*1  +  3*1*1  +  4*1*3  +  14*1*2 
f\  =  5  *'J*2*1*1  +  3*1*3*'!  +  16*1*1*1  +  19*1*1  4~  4*3  +  19*1 
g,  =  19*1*2*1*1  +  9*1*3*!  +  18*1*1*1  -f  17*1*1 
d.t  —  19*1*2*1*1  +  18*1*1*3*1  -j-  18*1*1*1  +  ll*f*3*!  +  3*1*1*1 
f5  —  17*2*1*4*1  +  10*!*]*1*1  -|-  *1*4*5  +  2*2*1  4"  10*1*1 
35  =  17*2*1*4*1  +  3*,1*!*!*!  +  18*1*4*5  +  10*1*1 
d5  =  2*2*j*4*5  +  3*1*1*1*1  +  12*1*4*5  4*  10*2*1  4-  18*1*1 
/6  =  7*1*]*1*1*1*1  +  2*!*!*1*1*5*1  4-  9*l*l*b  +  19*1*1*1  +  3*i  x\x\x\ 

Ste  =  9*f*l*l*l*l*l  +  15*1*2*1*1*5*6  4-  4*1*1*6  4-  *3*1*2  4-  18*,  *1*1*1 
=  13*1*2*1*4*5*1  4-  12*1*2*1*1*5*6  4-  14*1*1*6  4-  17*1*1*1  +  8*1*1*1*1 
/?  =  10*1*1*4*6*7  4-  *1*1*;!*!*?  4-  1 4*1  *2*  1  * 5*6*7  4*  13*?*1  4"  7*1*1*!*1*1 
=  4*1*  1*4*6*!  +  17*1*1*1*1*1  +  2*1*2*!*5*6*7  4-  5*1*1  4-  13*1*1*1*1*1 
d7  =  7*1*]*4*b*1  +  17*1*1*3*1*1  +  19*112*1*1*1*1  +  2*1*1  -f  5*1*:]*1*1*1 


k  —  l4ziXixlx5 x\  -f- 18 x7txf2xlxlxlxlx^XB  +  I0x?2xlx^x^  -f- 1 2xix\xixix\x(,  +  lz\x\x\xt 

gw  =  llxiiixlx5xl  +  +  12*2X3X5X6*7  +  17x1X2X3X1X5*6  +  xjxji^xe 

dts  =  16x1X2X4X5*^  4-  I5x'lx2slz24zlzlxizx  -f-  6X2X3X5X6X7  +  14xix^x3x4x|x6  +  8x^x^xe 


f}=  2X2X3X5X5X11X9  -f  xf*2*3X5X^Xg  +  x{*3 X^XsXgXrXgXg  +  1  lxJxsXjX^XjXTXBXg 

4  I6X2X4X6X7X6 

gg  —  XjXjXjXgXiiXg  4  lOx/XjXjXlxfXg -f  13xJx^X4X^xJx7XgX9  1 5*j X2X3X4X5X7X8X9 
-f  1 7*2  X4  *5X7X8 

dg~  912X3X5X5X6X9  -f  Gx'l^XjXr.XjXHXg  4-  X'fXjXjIjI^XlXgXo  4-  10xJx2X3xJx£*7x!x9 
4-  12x2X4*6X7*! 


ZlO=  18xfx3XtX6*7X«xJ0  4-  15x]x2x!x!x7x2*J*io  4-  XjX^lJx^XgX^  4-  1  lx]z*Xfix\0 

4-  13X1X)*!*!*;*! 

gio=  13xf*3X.ix!x?x«xJ0  4-  17*!x3*4X7*!x9xJ0  4-  6*?*2*6*?0  4- 2*1  Xj*!*!*}*! 
dio  =  9x^*3 XlXgX^XHXjQ  4  17x,'x2x!x!x7x!xgX'j0  4-  nX^XjXjXj’x!*)*^  4-  3xjx|x!xl02 
4-  10x,x3x!x!x}x! 
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